[slurm-users] Slurm versions 24.05.2, 23.11.9, and 23.02.8 are now available (security fix for switch plugins)

2024-07-31 Thread Tim Wickberg via slurm-users
Slurm versions 24.05.2, 23.11.9, and 23.02.8 are now available and 
include a fix for a recently discovered security issue with the switch 
plugins.


SchedMD customers were informed on July 17th and provided a patch on 
request; this process is documented in our security policy. [1]


For the switch/hpe_slingshot and switch/nvidia_imex plugins, a user 
could override the isolation between Slingshot VNIs or IMEX channels.


If you do not have one of these switch plugins configured, then you are 
not impacted by this issue.
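
One quick way to confirm which switch plugin (if any) a cluster is 
running is to check the live configuration. This is only a sketch; the 
SwitchType value shown in the output is illustrative:

   # Show the switch plugin currently configured on the cluster
   $ scontrol show config | grep -i switchtype
   SwitchType              = switch/hpe_slingshot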


It is unclear what information, if any, could be accessed through an 
unauthorized channel. This disclosure is being made out of an abundance 
of caution.


If you do have one of these plugins enabled, the slurmctld must be 
restarted before the slurmd daemons to avoid disruption.
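
As a rough sketch of that restart order (assuming systemd-managed 
daemons and a parallel shell such as pdsh; hostnames are placeholders):

   # 1. Restart the controller first
   $ sudo systemctl restart slurmctld

   # 2. Then restart slurmd on the compute nodes
   $ pdsh -w node[001-100] 'sudo systemctl restart slurmd'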


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security-policy/

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 24.05.2
==
 -- Fix energy gathering RPC counter underflow in _rpc_acct_gather_energy when
more than 10 threads try to get energy at the same time. This prevented
any step from retrieving energy data from slurmd until slurmd was
restarted, losing energy accounting metrics on the node.
 -- accounting_storage/mysql - Fix issue where new user with wckey did not
have a default wckey sent to the slurmctld.
 -- slurmrestd - Prevent slurmrestd segfault when handling the following
endpoints when none of the optional parameters are specified:
  'DELETE /slurm/v0.0.40/jobs'
  'DELETE /slurm/v0.0.41/jobs'
  'GET /slurm/v0.0.40/shares'
  'GET /slurm/v0.0.41/shares'
  'GET /slurmdb/v0.0.40/instance'
  'GET /slurmdb/v0.0.41/instance'
  'GET /slurmdb/v0.0.40/instances'
  'GET /slurmdb/v0.0.41/instances'
  'POST /slurm/v0.0.40/job/{job_id}'
  'POST /slurm/v0.0.41/job/{job_id}'
 -- Fix IPMI energy gathering when no IPMIPowerSensors are specified in
acct_gather.conf. This situation resulted in an accounted energy of 0
for job steps.
 -- Fix a minor memory leak in slurmctld when updating a job dependency.
 -- scontrol,squeue - Fix regression that caused incorrect values for
multisocket nodes at '.jobs[].job_resources.nodes.allocation' for
'scontrol show jobs --(json|yaml)' and 'squeue --(json|yaml)'.
 -- slurmrestd - Fix regression that caused incorrect values for
multisocket nodes at '.jobs[].job_resources.nodes.allocation' to be dumped
with endpoints:
  'GET /slurm/v0.0.41/job/{job_id}'
  'GET /slurm/v0.0.41/jobs'
 -- jobcomp/filetxt - Fix truncation of job record lines > 1024 characters.
 -- Fixed regression that prevented compilation on FreeBSD hosts.
 -- switch/hpe_slingshot - Drain node on failure to delete CXI services.
 -- Fix a performance regression from 23.11.0 in cpu frequency handling when no
CpuFreqDef is defined.
 -- Fix one-task-per-sharing not working across multiple nodes.
 -- Fix inconsistent number of cpus when creating a reservation using the
TRESPerNode option.
 -- data_parser/v0.0.40+ - Fix job state parsing which could break filtering.
 -- Prevent cpus-per-task from being modified in jobs where a -c value has been
explicitly specified and the requested memory constraints implicitly
increase the number of CPUs to allocate.
 -- slurmrestd - Fix regression where args '-s v0.0.39,dbv0.0.39' and
'-d v0.0.39' would result in 'GET /openapi/v3' not registering as a valid
possible query resulting in 404 errors.
 -- slurmrestd - Fix memory leak for dbv0.0.39 jobs query which occurred if the
query parameters specified account, association, cluster, constraints,
format, groups, job_name, partition, qos, reason, reservation, state, users,
or wckey. This affects the following endpoints:
  'GET /slurmdb/v0.0.39/jobs'
 -- slurmrestd - In the case the slurmdbd does not respond to a persistent
connection init message, prevent the closed fd from being used, and instead
emit an error or warning depending on if the connection was required.
 -- Fix 24.05.0 regression that caused the slurmdbd not to send back an error
message if there is an error initializing a persistent connection.
 -- Reduce latency of forwarded x11 packets.
 -- Add "curr_dependency" (representing the current dependency of the job)
and "orig_dependency" (representing the original requested dependency of
the job) fields to the job record in job_submit.lua (for job update) and
jobcomp.lua.
 -- Fix potential segfault of slurmctld configured with
SlurmctldParameters=enable_rpc_queue from happening on reconfigure.
 -- Fix potential segfault of slurmctld on its shutdown when rate limiting
is enabled.
 -- slurmrestd - Fix missing job environment for SLURM_JOB_NAME,
SLURM_OPEN_MOD

[slurm-users] Slurm version 24.05.1 is now available

2024-06-27 Thread Tim Wickberg via slurm-users

We are pleased to announce the availability of Slurm version 24.05.1.

This release addresses a number of minor-to-moderate issues since the 
24.05 release was first announced a month ago.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim



* Changes in Slurm 24.05.1
==
 -- Fix slurmctld and slurmdbd potentially stopping instead of performing a
logrotate when receiving SIGUSR2 when using auth/slurm.
 -- switch/hpe_slingshot - Fix slurmctld crash when upgrading from 23.02.
 -- Fix "Could not find group" errors from validate_group() when using
AllowGroups with large /etc/group files.
 -- Prevent an assertion in debugging builds when triggering log rotation
in a backup slurmctld.
 -- Add AccountingStoreFlags=no_stdio which, when set, prevents the stdio
paths of the job from being recorded.
 -- slurmrestd - Prevent a slurmrestd segfault when parsing the crontab field,
which was never usable. Now it explicitly ignores the value and emits a
warning if it is used for the following endpoints:
  'POST /slurm/v0.0.39/job/{job_id}'
  'POST /slurm/v0.0.39/job/submit'
  'POST /slurm/v0.0.40/job/{job_id}'
  'POST /slurm/v0.0.40/job/submit'
  'POST /slurm/v0.0.41/job/{job_id}'
  'POST /slurm/v0.0.41/job/submit'
  'POST /slurm/v0.0.41/job/allocate'
 -- mpi/pmi2 - Fix communication issue leading to task launch failure with
"invalid kvs seq from node".
 -- Fix getting user environment when using sbatch with "--get-user-env" or
"--export=" when there is a user profile script that reads /proc.
 -- Prevent slurmd from crashing if acct_gather_energy/gpu is configured but
GresTypes is not configured.
 -- Do not log the following errors when AcctGatherEnergyType plugins are used
but a node does not have or cannot find sensors:
"error: _get_joules_task: can't get info from slurmd"
"error: slurm_get_node_energy: Zero Bytes were transmitted or received"
However, the following error will continue to be logged:
"error: Can't get energy data. No power sensors are available. Try later"
 -- sbatch, srun - Set SLURM_NETWORK environment variable if --network is set.
 -- Fix cloud nodes not being able to forward to nodes that restarted with new
IP addresses.
 -- Fix cwd not being set correctly when running a SPANK plugin with a
spank_user_init() hook and the new "contain_spank" option set.
 -- slurmctld - Avoid deadlock during shutdown when auth/slurm is active.
 -- Fix segfault in slurmctld with topology/block.
 -- sacct - Fix printing of job group for job steps.
 -- scrun - Log when an invalid environment variable causes the job submission
to be rejected.
 -- accounting_storage/mysql - Fix problem where listing or modifying an
association when specifying a qos list could hang or take a very long time.
 -- gpu/nvml - Fix gpuutil/gpumem only tracking last GPU in step. Now,
gpuutil/gpumem will record sums of all GPUs in the step.
 -- Fix error in scrontab jobs when using slurm.conf:PropagatePrioProcess=1.
 -- Fix slurmctld crash on a batch job submission with "--nodes 0,...".
 -- Fix dynamic IP address fanout forwarding when using auth/slurm.
 -- Restrict listening sockets in the mpi/pmix plugin and sattach to the
SrunPortRange.
 -- slurmrestd - Limit mime types returned from query to 'GET /openapi/v3' to
only return one mime type per serializer plugin to fix issues with OpenAPI
client generators that are unable to handle multiple mime type aliases.
 -- Fix many commands possibly reporting an "Unexpected Message Received" when
in reality the connection timed out.
 -- Prevent slurmctld from starting if there is not a json serializer present
and the extra_constraints feature is enabled.
 -- Fix heterogeneous job components not being signaled with scancel --ctld and
'DELETE slurm/v0.0.40/jobs' if the job ids are not explicitly given,
the heterogeneous job components match the given filters, and the
heterogeneous job leader does not match the given filters.
 -- Fix regression from 23.02 impeding job licenses from being cleared.
 -- Demote to a log_flag the _get_joules_task error that was logged to the
user when too many RPCs were queued in slurmd for gathering energy.
 -- For scancel --ctld and the associated rest api endpoints:
  'DELETE /slurm/v0.0.40/jobs'
  'DELETE /slurm/v0.0.41/jobs'
Fix canceling the final array task in a job array when the task is pending
and all array tasks have been split into separate job records. Previously
this task was not canceled.
 -- Fix power_save operation after recovering from a failed reconfigure.
 -- slurmctld - Skip removing the pidfile when running under systemd. In that
situation it is never created in the first place.
 -- Fix issue where altering the flags on a Slurm account (UsersAreCoords)
caused several limits on the account's association to be set to 0 in
Slurm's internal cache.
 -- 

[slurm-users] Re: Convergence of Kube and Slurm?

2024-05-06 Thread Tim Wickberg via slurm-users
Note: I’m aware that I can run Kube on a single node, but we need more 
resources. So ultimately we need a way to have Slurm and Kube exist in 
the same cluster, both sharing the full amount of resources and both 
being fully aware of resource usage.


This is something that we (SchedMD) are working on, although it's a bit 
earlier than I was planning to publicly announce anything...


This is a very high-level view, and I have to apologize for stalling a 
bit, but: we've hired a team to build out a collection of tools that 
we're calling "Slinky" [1]. These provide for canonical ways of running 
Slurm within Kubernetes, ways of maintaining and managing the cluster 
state, and scheduling integration to allow for compute nodes to be 
available to both Kubernetes and Slurm environments while coordinating 
their status.


We'll be talking about it in more details at the Slurm User Group 
Meeting in Oslo [3], then KubeCon North America in Salt Lake, and SC'24 
in Atlanta. We'll have the (open-source, Apache 2.0 licensed) code for 
our first development phase available by SC'24 if not sooner.


There's a placeholder documentation page [4] that points to some of the 
presentations I've given before about approaches to tackling this 
converged-computing model, but I'll caution they're a bit dated, and the 
Slinky-specific presentations we've been working on internally aren't 
publicly available yet.


If there are SchedMD support customers that have specific use cases, 
please feel free to ping your account managers if you'd like to chat at 
some point in the next few months.


- Tim

[1] Slinky is not an acronym (neither is Slurm [2]), but loosely stands 
for "Slurm in Kubernetes".


[2] https://slurm.schedmd.com/faq.html#acronym

[3] https://www.schedmd.com/about-schedmd/events/

[4] https://slurm.schedmd.com/slinky.html

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm releases move to a six-month cycle

2024-03-26 Thread Tim Wickberg via slurm-users
Slurm major releases are moving to a six-month release cycle. This 
change starts with the upcoming Slurm 24.05 release this May. Slurm 
24.11 will follow in November 2024. Major releases then continue every 
May and November in 2025 and beyond.


There are two main goals of this change:

- Faster delivery of newer features and functionality for customers.
- "Predictable" release timing, especially for those sites that would 
prefer to upgrade during an annual system maintenance window.


SchedMD will be adjusting our handling of backwards-compatibility within 
Slurm itself, and how SchedMD's support services will handle older releases.


For the 24.05 release, Slurm will still only support upgrading from (and 
mixed-version operations with) the prior two releases (23.11, 23.02). 
Starting with 24.11, Slurm will start supporting upgrades from the prior 
three releases (24.05, 23.11, 23.02).


SchedMD's Slurm Support has been built around an 18-month cycle. This 
18-month cycle has traditionally covered the current stable release 
plus one prior major release. With the increase in release frequency, 
this support window will now cover the current stable release plus 
two prior major releases.


The blog post version of this announcement includes a table that 
outlines the updated support lifecycle:

https://www.schedmd.com/slurm-releases-move-to-a-six-month-cycle/

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Re: Mailing list upgrade - slurm-users list paused

2024-01-30 Thread Tim Wickberg via slurm-users

Welcome to the updated list. Posting is re-enabled now.

- Tim

On 1/30/24 11:56, Tim Wickberg wrote:

Hey folks -

The mailing list will be offline for about an hour as we upgrade the 
host, upgrade the mailing list software, and change the mail 
configuration around.


As part of these changes, the "From: " field will no longer be the 
original sender, but instead use the mailing list ID itself. This is to 
comply with DMARC sending options, and allow us to start DKIM signing 
messages to ensure deliverability once Google and Yahoo impose new 
policy changes in February.


This is the last post on the current (mailman2) list. I'll send a 
welcome message on the upgraded (mailman3) list once finished, and when 
the list is open to new traffic again.


- Tim





[slurm-users] Mailing list upgrade - slurm-users list paused

2024-01-30 Thread Tim Wickberg

Hey folks -

The mailing list will be offline for about an hour as we upgrade the 
host, upgrade the mailing list software, and change the mail 
configuration around.


As part of these changes, the "From: " field will no longer be the 
original sender, but instead use the mailing list ID itself. This is to 
comply with DMARC sending options, and allow us to start DKIM signing 
messages to ensure deliverability once Google and Yahoo impose new 
policy changes in February.


This is the last post on the current (mailman2) list. I'll send a 
welcome message on the upgraded (mailman3) list once finished, and when 
the list is open to new traffic again.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm versions 23.11.1, 23.02.7, 22.05.11 are now available (CVE-2023-49933 through CVE-2023-49938)

2023-12-13 Thread Tim Wickberg
Slurm versions 23.11.1, 23.02.7, 22.05.11 are now available and address 
a number of recently-discovered security issues. They've been assigned 
CVE-2023-49933 through CVE-2023-49938.


SchedMD customers were informed on November 29th and provided a patch on 
request; this process is documented in our security policy. [1]


There are no mitigations available for these issues; the only option is 
to patch and restart the affected daemons.




Five issues were reported by Ryan Hall (Meta Red Team X):

1) Slurmd Message Integrity Bypass. (Slurm 23.02 and 23.11.)
   CVE-2023-49935

Permits an attacker to reuse root-level authentication tokens when 
interacting with the slurmd process, bypassing the RPC message hashes 
which protect against malicious MUNGE credential reuse.


2) Slurm Arbitrary File Overwrite. (Slurm 22.05 and 23.02.)
   CVE-2023-49938

Permits an attacker to modify their extended group list used with the 
sbcast subsystem, and open files with an incorrect set of extended groups.


3) Slurm NULL Pointer Dereference. (Slurm 22.05, 23.02, 23.11.)
   CVE-2023-49936

Denial of service.

4) Slurm Protocol Double Free. (Slurm 22.05, 23.02, 23.11.)
   CVE-2023-49937

Denial of service, potential for arbitrary code execution.

5) Slurm Protocol Message Extension. (Slurm 22.05, 23.02, 23.11.)
   CVE-2023-49933

Allows for malicious modification of RPC traffic that bypasses the 
message hash checks.


A sixth issue was discovered internally by SchedMD:

6) SQL Injection. (Slurm 23.11.)
   CVE-2023-49934

Arbitrary SQL injection against SlurmDBD's SQL database.



SchedMD only issues security fixes for the supported releases (currently 
23.11, 23.02 and 22.05). Due to the complexity of these fixes, we do not 
recommend attempting to back-port the fixes to older releases, and 
strongly encourage sites to upgrade to fixed versions immediately.


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 23.11.1
==
 -- Fix scontrol update job=... TimeLimit+=/-= when used with a raw JobId of a
job array element.
 -- Reject TimeLimit increment/decrement when called on job with
TimeLimit=UNLIMITED.
 -- Fix slurmctld segfault when reconfiguring after a job resize.
 -- Fix compilation on FreeBSD.
 -- Fix issue with requesting a job with --licenses as well as
--tres-per-task=license.
 -- slurmctld - Prevent segfault in getopt_long() with an invalid long option.
 -- Switch to man2html-base in Build-Depends for Debian package.
 -- slurmrestd - Added /meta/slurm/cluster field to responses.
 -- Adjust systemd service files to start daemons after remote-fs.target.
 -- Add "--with selinux" option to slurm.spec.
 -- Fix task/cgroup indexing tasks in cgroup plugins, which caused
jobacct/gather to match the gathered stats with the wrong task id.
 -- select/linear - Fix regression in 23.11 in which jobs that requested
--cpus-per-task were rejected.
 -- Fix crash in slurmstepd that can occur when launching tasks via mpi using
the pmi2 plugin and using the route/topology plugin.
 -- Fix sgather not gathering from all nodes when using CR_PACK_NODES/-m pack.
 -- Fix mysql query syntax error when getting jobs with private data.
 -- Fix sanity check to prevent deleting default account of users.
 -- data_parser/v0.0.40 - Fix the parsing for /slurmdb/v0.0.40/jobs exit_code
query parameter.
 -- Fix issue where TRES for energy wasn't always set before sending it to the
jobcomp plugin.
 -- jobcomp/[kafka|elasticsearch] - Print raw TRES values along with the
formatted versions as tres_[req|alloc]_raw.
 -- Fix inconsistencies with --cpu-bind/SLURM_CPU_BIND and --hint/SLURM_HINT.
 -- Fix ignoring invalid json in various subsystems.
 -- Remove shebang from bash completion script.
 -- Fix elapsed time in JobComp being set from invalid start and end times.
 -- Update service files to start slurmd, slurmctld, and slurmdbd after sssd.
 -- data_parser/v0.0.40 - Fix output of DefMemPerCpu, MaxMemPerCpu, and
max_shares.
 -- When determining a job's index in the database, don't wait if there are
more jobs waiting.
 -- If a job requests enough shards that more than one sharing GRES (gpu)
would be allocated per node, refuse it unless SelectTypeParameters has
MULTIPLE_SHARING_GRES_PJ.
 -- Avoid refreshing the hwloc xml file when slurmd is reconfigured. This fixes
an issue seen with CoreSpecCount used on nodes with Intel E-cores.
 -- Trigger fatal exit when Slurm API function is called before slurm_init() is
called.
 -- slurmd - Fix issue with 'scontrol reconfigure' when started with '-c'.
 -- data_parser/v0.0.40 - Fix handling of negative job nice values.
 -- data_parser/v0.0.40 - Fill the "id" object for associations with the
cluster

[slurm-users] Slurm version 23.11 is now available

2023-11-21 Thread Tim Wickberg

We are pleased to announce the availability of the Slurm 23.11 release.

To highlight some new features in 23.11:

- Substantially overhauled the SlurmDBD association management code. For 
clusters updated to 23.11, account and user additions or removals are 
significantly faster than in prior releases.


- Overhauled 'scontrol reconfigure' to prevent configuration mistakes 
from disabling slurmctld / slurmd. Instead, an error will be returned, 
and the running configuration will persist. This does require updating 
the systemd service files to use the --systemd option to slurmctld / 
slurmd (a sketch of that change follows this feature list).


- Added a new internal auth/cred plugin - "auth/slurm". This builds off 
the prior auth/jwt model, and permits operation of the slurmdbd and 
slurmctld without access to full directory information with a suitable 
configuration.


- Added a new --external-launcher option to srun, which is automatically 
set by common MPI launcher implementations and ensures processes using 
those non-srun launchers have full access to all resources allocated on 
each node.


- Reworked the dynamic/cloud modes of operation to allow for "fanout" - 
where Slurm communication can be automatically offloaded to compute 
nodes for increased cluster scalability.


- Added initial official Debian packaging support.

- Overhauled and extended the Reservation subsystem to allow for most of 
the same resource requirements as are placed on the job. Notably, this 
permits reservations to now reserve GRES directly.
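
As a hedged sketch of the service file change mentioned above for 
'scontrol reconfigure': the key point is that the daemons are now 
started with the --systemd flag. Exact unit file contents depend on 
your packaging, so treat this drop-in override as illustrative only:

   # /etc/systemd/system/slurmctld.service.d/override.conf (illustrative)
   [Service]
   Type=notify
   ExecStart=
   ExecStart=/usr/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS

   # Apply with: systemctl daemon-reload && systemctl restart slurmctld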


The "Slurm 23.02, 23.11, and Beyond" presentation from the Slurm 
Community BoF at SC23 (https://slurm.schedmd.com/publications.html) has 
an overview of this release.


The Slurm documentation at https://slurm.schedmd.com has also been 
updated to the 23.11 release. (Older versions can be found in the 
archive, linked from the main documentation page.)


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm release candidate version 23.11rc1 available for testing

2023-11-07 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 23.11.0rc1.


To highlight some new features coming in 23.11:

- Substantially overhauled the SlurmDBD association management code. For 
clusters updated to 23.11, account and user additions or removals are 
significantly faster than in prior releases.


- Overhauled 'scontrol reconfigure' to prevent configuration mistakes 
from disabling slurmctld / slurmd. Instead, an error will be returned, 
and the running configuration will persist. This does require updates to 
the systemd service files to use the --systemd option to slurmctld / slurmd.


- Added a new internal auth/cred plugin - "auth/slurm". This builds off 
the prior auth/jwt model, and permits operation of the slurmdbd and 
slurmctld without access to full directory information with a suitable 
configuration.


- Added a new --external-launcher option to srun, which is automatically 
set by common MPI launcher implementations and ensures processes using 
those non-srun launchers have full access to all resources allocated on 
each node.


- Reworked the dynamic/cloud modes of operation to allow for "fanout" - 
where Slurm communication can be automatically offloaded to compute 
nodes for increased cluster scalability.


- Added initial official Debian packaging support.

- Overhauled and extended the Reservation subsystem to allow for most of 
the same resource requirements as are placed on the job. Notably, this 
permits reservations to now reserve GRES directly.


This is the first release candidate of the upcoming 23.11 release
series, and represents the end of development for this release, and a
finalization of the RPC and state file formats.

If any issues are identified with this release candidate, please report
them through https://bugs.schedmd.com against the 23.11.x version and we
will address them before the first production 23.11.0 release is made.

Please note that the release candidates are not intended for production use.

A preview of the updated documentation can be found at
https://slurm.schedmd.com/archive/slurm-master/ .

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm versions 23.02.6 and 22.05.10 are now available (CVE-2023-41914)

2023-10-11 Thread Tim Wickberg
Slurm versions 23.02.6 and 22.05.10 are now available to address a 
number of filesystem race conditions that could let an attacker take 
control of an arbitrary file, or remove entire directories' contents 
(CVE-2023-41914).


SchedMD customers were informed on September 27th and provided a patch 
on request; this process is documented in our security policy [1].



CVE-2023-41914:

A number of race conditions have been identified within the 
slurmd/slurmstepd processes that can lead to the user taking ownership 
of an arbitrary file on the system. A related issue can lead to the user 
overwriting an arbitrary file on the compute node (although with data 
that is not directly under their control). A related issue can also lead 
to the user deleting all files and sub-directories of an arbitrary 
target directory on the compute node.


Thank you to François Diakhate (CEA) for reporting the original issue to 
us. A number of related issues were found during an extensive audit of 
Slurm's filesystem handling code in reaction to that report, and are 
included here in this same disclosure.



SchedMD only issues security fixes for the supported releases (currently 
23.02 and 22.05). Due to the complexity of these fixes, we do not 
recommend attempting to backport the fixes to older releases, and 
strongly encourage sites to upgrade to fixed versions immediately.


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 23.02.6
==
 -- Fix CpusPerTres= not updatable with scontrol update.
 -- Fix unintentional gres removal when validating the gres job state.
 -- Fix --without-hpe-slingshot configure option.
 -- Fix cgroup v2 memory calculations when transparent huge pages are used.
 -- Fix parsing of sgather --timeout option.
 -- Fix regression from 22.05.0 that caused srun --cpu-bind "=verbose" and "=v"
options to give different CPU bind masks.
 -- Fix "_find_node_record: lookup failure for node" error message appearing
for all dynamic nodes during reconfigure.
 -- Avoid segfault if loading serializer plugin fails.
 -- slurmrestd - Correct OpenAPI format for 'GET /slurm/v0.0.39/licenses'.
 -- slurmrestd - Correct OpenAPI format for 'GET /slurm/v0.0.39/job/{job_id}'.
 -- slurmrestd - Change format to multiple fields in 'GET
/slurmdb/v0.0.39/associations' and 'GET /slurmdb/v0.0.39/qos' to handle
infinite and unset states.
 -- When a node fails in a job with --no-kill, preserve the extern step on the
remaining nodes to avoid breaking features that rely on the extern step
such as pam_slurm_adopt, x11, and job_container/tmpfs.
 -- auth/jwt - Ignore 'x5c' field in JWKS files.
 -- auth/jwt - Treat 'alg' field as optional in JWKS files.
 -- Allow job_desc.selinux_context to be read from the job_submit.lua script.
 -- Skip check in slurmstepd that causes a large number of errors in the munge
log: "Unauthorized credential for client UID=0 GID=0".  This error will
still appear on slurmd/slurmctld/slurmdbd start up and is not a cause for
concern.
 -- slurmctld - Allow startup with zero partitions.
 -- Fix some mig profile names in slurm not matching nvidia mig profiles.
 -- Prevent slurmscriptd processing delays from blocking other threads in
slurmctld while trying to launch {Prolog|Epilog}Slurmctld.
 -- Fix sacct printing ReqMem field when memory doesn't exist in requested TRES.
 -- Fix how heterogeneous steps in an allocation with CR_PACK_NODE or -m pack
are created.
 -- Fix slurmctld crash from race condition within job_submit_throttle plugin.
 -- Fix --with-systemdsystemunitdir when requesting a default location.
 -- Fix not being able to cancel an array task by the jobid (i.e. not
_) through scancel, job launch failure or prolog failure.
 -- Fix cancelling the whole array job when the array task is the meta job and
it fails job or prolog launch and is not requeueable. Cancel only the
specific task instead.
 -- Fix regression in 21.08.2 where MailProg did not run for mail-type=end for
jobs with non-zero exit codes.
 -- Fix incorrect setting of memory.swap.max in cgroup/v2.
 -- Fix jobacctgather/cgroup collection of disk/io, gpumem, gpuutil TRES values.
 -- Fix -d singleton for heterogeneous jobs.
 -- Downgrade info logs about a job meeting a "maximum node limit" in the
select plugin to DebugFlags=SelectType. These info logs could spam the
slurmctld log file under certain circumstances.
 -- prep/script - Fix [Srun|Task] missing SLURM_JOB_NODELIST.
 -- gres - Rebuild GRES core bitmap for nodes at startup. This fixes error:
"Core bitmaps size mismatch on node [HOSTNAME]", which causes jobs to enter
state "Requested node configuration is not 

[slurm-users] SC'23 Presentations Online; SLUG'24 will be at the University of Oslo Sept. 2024

2023-09-24 Thread Tim Wickberg

Presentations from SLUG'23 at BYU are in the publication archive now:
https://slurm.schedmd.com/publications.html

A huge thanks to all attendees, presenters, and Brigham Young 
University for hosting a successful event. I look forward to seeing many 
of you again at SC'23 in Denver, or other events in the upcoming year.


Speaking of SC'23: the Slurm Booth will be back once again - #463 on the 
show floor. The Slurm Birds-of-a-Feather session will be held on 
Thursday, November 16th, in rooms 201-203.


The Slurm User Group Meeting ("SLUG'24") will be held in person at the 
University of Oslo in September 2024. We're still working to finalize 
the exact dates, and will have a call for presentations out in the spring.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 23.02 is now available

2023-02-28 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 23.02.

To highlight some new features in 23.02:

- Added a new (optional) RPC rate limiting system in slurmctld.
- Added usage gathering for gpu/nvml (Nvidia) and gpu/rsmi (AMD)
  plugins.
- Added a new jobcomp/kafka plugin.
- Overhauled the 'remote resources' (licenses) functionality managed
  through sacctmgr / slurmdbd, and introduced a new 'lastconsumed' field
  that is intended to be frequently updated with the current usage as
  reported by 'lmstat' or similar tools. This allows for better
  cooperative license usage, especially for systems with external
  workstations and other use that is not under Slurm's control.
- Added a new 'scrun' command which can serve as an OCI runtime proxy.
- Support for --json/--yaml output from most Slurm commands, which has
  been extended to support additional filtering options.
- Extended "configless" operation to allow for propagation of files
  referenced in "Include" directives.
- Allowed Slurm to automatically create directories for stdout/stderr
  files. These can include substitution directives such as %j, %a,
  and %x.
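
As a small illustrative example of the last item (the script name and 
directory layout are made up), Slurm will now create the output 
directory on demand:

   # Creates logs/<job name>/ if it does not already exist
   $ sbatch --job-name=myjob --output=logs/%x/%j.out --error=logs/%x/%j.err job.sh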

The main Slurm documentation site at https://slurm.schedmd.com/ has been 
updated to the new release now as well.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm release candidate version 23.02rc1 available for testing

2023-02-14 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 23.02rc1.


To highlight some new features coming in 23.02:

- Added a new (optional) RPC rate limiting system in slurmctld.
- Added usage gathering for gpu/nvml (Nvidia) and gpu/rsmi (AMD)
  plugins.
- Added a new jobcomp/kafka plugin.
- Overhauled the 'remote resources' (licenses) functionality managed
  through sacctmgr / slurmdbd, and introduced a new 'lastconsumed' field
  that is intended to be frequently updated with the current usage as
  reported by 'lmstat' or similar tools. This allows for better
  cooperative license usage, especially for systems with external
  workstations and other use that is not under Slurm's control.
- Added a new 'scrun' command which can serve as an OCI runtime proxy.
- Support for --json/--yaml output from most Slurm commands, which has
  been extended to support additional filtering options.
- Extended "configless" operation to allow for propagation of files
  referenced in "Include" directives.
- Allow Slurm to create directories for stdout/stderr files. These can
  include substitution directives such as %j, %a, and %x.

This is the first release candidate of the upcoming 23.02 release 
series, and represents the end of development for this release, and a 
finalization of the RPC and state file formats.


If any issues are identified with this release candidate, please report 
them through https://bugs.schedmd.com against the 23.02.x version and we 
will address them before the first production 23.02.0 release is made.


Please note that the release candidates are not intended for production use.

A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ .


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] SC'22 Presentations Online; SLUG'23 will be at BYU Sept. 2023

2022-12-05 Thread Tim Wickberg

Two quick announcements I wanted to share:

Presentations from SC'22 in Dallas are in the publication archive now:
https://slurm.schedmd.com/publications.html

The Slurm User Group Meeting ("SLUG'23") will be held in person in 
Provo, Utah, at Brigham Young University in September 2023. We're still 
working to finalize the exact dates, and will have a call for 
presentations out in the spring.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm at SC'22 next week - booth 1043

2022-11-09 Thread Tim Wickberg
For those of you attending the SC'22 conference in Dallas next week, the 
Slurm booth will be back on the show floor in #1043.


We'll have t-shirts to hand out as usual, alongside a full presentation 
schedule both at the Slurm booth as well as at several partner 
organizations.


In addition to the text summary below, the agenda (which will be updated 
if anything changes) can be seen at: https://tinyurl.com/slurmatsc22


== Slurm Booth at SC'22 ==

Tuesday, November 15th:

10:30am - Introduction to Slurm
11:30am - Slurm 22.05, 23.02, and Beyond
1:15pm - ORNL - Matt Ezell
2:15pm - Introduction to Slurm
3:15pm - Slurm and/or/vs Kubernetes Forum
4:15pm - Amazon AWS

Wednesday, November 16th:

10:30am - Introduction to Slurm
11:30am - Slurm 22.05, 23.02, and Beyond
1:15pm - Google GCP - Wyatt Gorman
2:15pm - Introduction to Slurm
3:15pm - NERSC - Doug Jacobsen
4:15pm - Slurm and/or/vs Kubernetes Forum

Thursday, November 17th:

10:30am - Introduction to Slurm
11:30am - Slurm 22.05, 23.02, and Beyond

== Slurm at Partners at SC'22 ==

Tuesday, November 15th:

11:00am - Slurm on GCP (at Google Booth - 3210)
2:00pm - Slurm for TPUs (at Google Booth - 3210)
4:00pm - Slurm on AWS (at Amazon Booth - 2425)
5:00pm - Slurm and Dell (at Dell Booth - 2443)

Wednesday, November 16th:

10:30am - Slurm and Lenovo (at Lenovo Booth - 1204)
11:30am - Slurm for TPUs (at Google Booth - 3210)
12:00pm - Slurm on AWS (at Amazon Booth - 2425)
1:30pm - Slurm on GCP (at Google Booth - 3210)

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 22.05.5 is now available

2022-10-13 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 22.05.5.

This fixes a number of moderate severity issues, alongside one 
unfortunate problem with the upgrade process for running jobs with the 
slurmstepd when using RPM-based installations. Please see Jason Booth's 
email to the slurm-users mailing list for further details, and ways to 
mitigate this problem:


https://lists.schedmd.com/pipermail/slurm-users/2022-September/009222.html

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 22.05.5
==
 -- Fix node becoming IDLE while in an invalid registration state.
 -- When a job is completing avoid potential dereference.
 -- Avoid setting preempt_time for a job erroneously.
 -- Fix situation where we don't requeue correctly when a job is finishing.
 -- job_container/tmpfs - Avoid leaking namespace file descriptor.
 -- common/slurm_opt - fix memory leak in client commands or slurmrestd when the
--chdir option is set after option reset.
 -- openapi/dbv0.0.38 - gracefully handle unknown associations assigned to jobs.
 -- openapi/dbv0.0.38 - query all associations to avoid errors while dumping
jobs.
 -- Load hash plugin at slurmstepd launch time to prevent issues loading the
plugin at step completion if the Slurm installation is upgraded.
 -- Fix gcc 12.2.1 compile errors.
 -- Fix future magnetic reservations preventing heterogeneous jobs from
starting.
 -- Prevent incorrect error message from being generated for operator/admins
using the 'scontrol top' command.
 -- slurmrestd - correct issue where larger requests could result in a single
byte getting removed from inside of the POST request.
 -- Fix regression in task count calculation for --ntasks-per-gpu with multiple
nodes.
 -- Update nvml plugin to match the unique id format for MIG devices in new
Nvidia drivers.
 -- Fix segfault on backup slurmdbd if no QoS is present in DB.
 -- Fix clang 11 compile errors.
 -- Fix task distribution calculations across sockets with
--distribution=cyclic.
 -- Fix task distribution calculations with --ntasks-per-gpu specified without
an explicit --ntasks value.
 -- Fix job arrays not showing correct features.
 -- Fix job having wrong features used when using preferred features.
 -- Fix task/cray_aries error finishing an interactive step, avoiding correct
cleanup.
 -- Correctly set max_nodes when --ntasks=1.
 -- Fix configure script on FreeBSD.




[slurm-users] Slurm version 22.05.4 is now available

2022-09-29 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 22.05.4.

This includes fixes to two potential crashes in the backfill scheduler, 
alongside a number of other moderate severity issues.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 22.05.4
==
 -- Fix return code from salloc when the job is revoked prior to executing user
command.
 -- Fix minor memory leak when dealing with gres with multiple files.
 -- Fix printing for no_consume gres in scontrol show job.
 -- sinfo - Fix truncation of very large values when outputting memory.
 -- Fix multi-node step launch failure when nodes in the controller aren't in
natural order. This can happen with inconsistent node naming (such as
node15 and node052) or with dynamic nodes which can register in any order.
 -- job_container/tmpfs - Prevent reading the plugin config multiple times per
step.
 -- Fix wrong attempt of gres binding for gres w/out cores defined.
 -- Fix build to work with '--without-shared-libslurm' configure flag.
 -- Fix power_save mode when repeatedly configuring too fast.
 -- Fix sacct -I option.
 -- Prevent jobs from being scheduled on future nodes.
 -- Fix memory leak in slurmd happening on reconfigure when CPUSpecList used.
 -- Fix sacctmgr show event [min|max]cpus.
 -- Fix regression in 22.05.0rc1 where a prolog or epilog that redirected stdout
to a file could get erroneously killed, resulting in job launch failure
(for the prolog) and the node being drained.
 -- cgroup/v1 - Use a static variable to avoid redundant checks of whether
the system has swap.
 -- cgroup/v1 - Add check for swap when running OOM check after task
termination.
 -- job_submit/lua - add --prefer support
 -- cgroup/v1 - fix issue where sibling steps could incorrectly be accounted as
OOM when step memory limit was the same as the job allocation. Detect OOM
events via memory.oom_control oom_kill when exposed by the kernel instead of
subscribing notifications with eventfd.
 -- Fix accounting of oom_kill events in cgroup/v2 and task/cgroup.
 -- Fix segfault when slurmd reports less than configured gres with links after
a slurmctld restart.
 -- Fix TRES counts after node is deleted using scontrol.
 -- sched/backfill - properly handle multi-reservation HetJobs.
 -- sched/backfill - don't try to start HetJobs after system state change.
 -- openapi/v0.0.38 - add submission of job->prefer value.
 -- slurmdbd - become SlurmUser at the same point in logic as slurmctld to match
plugins initialization behavior. This avoids a fatal error when starting
slurmdbd as root and root cannot start the auth or accounting_storage
plugins (for example, if root cannot read the jwt key).
 -- Fix memory leak when attempting to update a job's features with invalid
features.
 -- Fix occasional slurmctld crash or hang in backfill due to invalid pointers.
 -- Fix segfault on Cray machines if cgroup cpuset is used in cgroup/v1.




[slurm-users] SLUG'22 this Tuesday on the SchedMD YouTube channel

2022-09-15 Thread Tim Wickberg
The Slurm User Group Meeting 2022 (SLUG'22) will be held this Tuesday, 
and streamed through the SchedMD YouTube channel.


https://www.youtube.com/c/schedmdslurm

There will be four live presentations by SchedMD staff, alongside three 
community presentations (pre-recorded, but with live chat available to 
ask questions and interact with other participants).


The SLUG'22 playlist is:

https://www.youtube.com/playlist?list=PLZfwi0jHMBxDkdmRn91ImqnweZH9-1eIF

The agenda is:
https://slurm.schedmd.com/slurm_ug_agenda.html

(Direct links to each presentation will be filled in shortly.)

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 22.05.3 is now available

2022-08-11 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 22.05.3.

This release includes a number of low to moderate severity fixes made 
since the last maintenance release was made in June.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 22.05.3
==
 -- job_container/tmpfs - cleanup containers even when the .ns file isn't
mounted anymore.
 -- Ignore the bf_licenses option if using sched/builtin.
 -- Do not clear the job's requested QOS (qos_id) when ineligible due to QOS.
 -- Emit error and add fail-safe when job's qos_id changes unexpectedly.
 -- Fix timeout value in log.
 -- openapi/v0.0.38 - fix setting of DefaultTime when dumping a partition.
 -- openapi/dbv0.0.38 - correct parsing association QOS field.
 -- Fix LaunchParameters=mpir_use_nodeaddr.
 -- Fix various edge cases where accrue limits could be exceeded or cause
underflow error messages.
 -- Fix issue where a job requesting --ntasks and --nodes could be wrongly
rejected when spanning heterogeneous nodes.
 -- openapi/v0.0.38 - detect when partition PreemptMode is disabled
 -- openapi/v0.0.38 - add QOS flag to handle partition PreemptMode=within
 -- Add total_cpus and total_nodes values to the partition list in
the job_submit/lua plugin.
 -- openapi/dbv0.0.38 - reject and error on invalid flag values in well defined
flag fields.
 -- openapi/dbv0.0.38 - correct QOS preempt_mode flag requests being silently
ignored.
 -- accounting_storage/mysql - allow QOS preempt_mode flag updates when GANG
mode is requested.
 -- openapi/dbv0.0.38 - correct QOS flag modifications request being silently
ignored.
 -- sacct/sinfo/squeue - use openapi/[db]v0.0.38 for --json and --yaml modes.
 -- Improve error messages when using configless and fetching the config fails.
 -- Fix segfault when reboot_from_controller is configured and scontrol reboot
is used.
 -- Fix regression which prevented a cons_tres gpu job from being submitted to
a cons_tres cluster from a non-cons_tres cluster.
 -- openapi/dbv0.0.38 - correct association QOS list parsing for updates.
 -- Fix rollup incorrectly dividing up unused reservation time between
associations.
 -- slurmrestd - add SLURMRESTD_SECURITY=disable_unshare_files environment
variable.
 -- Update rsmi detection to handle new default library location.
 -- Fix header inclusion from slurmstepd manager code leading to multiple
definition errors when linking --without-shared-libslurm.
 -- slurm.spec - explicitly disable Link Time Optimization (LTO) to avoid
linking errors on systems where LTO-related RPM macros are enabled by
default and the binutils version has a bug.
 -- Fix issue in the api/step_io message writing logic leading to incorrect
behavior in API consuming clients like srun or sattach, including a segfault
when freeing IO buffers holding traffic from the tasks to the client.
 -- openapi/dbv0.0.38 - avoid job queries getting rejected when cluster is not
provided by client.
 -- openapi/dbv0.0.38 - accept job state filter as verbose names instead of
only numeric state ids.
 -- Fix regression in 22.05.0rc1: if slurmd shuts down while a prolog is
running, the job is cancelled and the node is drained.
 -- Wait up to PrologEpilogTimeout before shutting down slurmd to allow prolog
and epilog scripts to complete or timeout. Previously, slurmd waited 120
seconds before timing out and killing prolog and epilog scripts.
 -- GPU - Fix checking frequencies to check them all and not skip the last one.
 -- GPU - Fix logic to set frequencies properly when handling multiple GPUs.
 -- cgroup/v2 - Fix typo in error message.
 -- cgroup/v2 - More robust pattern search for events.
 -- Fix slurm_spank_job_[prolog|epilog] failures being masked if a Prolog or
Epilog script is defined (regression in 22.05.0rc1).
 -- When a job requested nodes and can't immediately start, only report to
the user (squeue/scontrol et al) if nodes are down in the requested list.
 -- openapi/dbv0.0.38 - Fix qos list/preempt not being parsed correctly.
 -- Fix dynamic nodes registrations mapping previously assigned nodes.
 -- Remove unnecessarily limit on count of 'shared' gres.
 -- Fix shared gres on CLOUD nodes not properly initializing.




[slurm-users] Slurm User Group Meeting '22 changes to virtual - September 20th

2022-07-22 Thread Tim Wickberg
The Slurm User Group Meeting '22 will be going virtual once again, and 
streamed through the SchedMD YouTube channel on Tuesday, September 20th.


While we had hoped to hold the meeting in person, we've made the 
difficult decision to revert to a virtual forum again on our usual 
September cadence. We are planning to visit BYU in-person - just a bit 
later than originally planned - in September 2023 instead.


A detailed agenda will be sent out towards the end of August, but I'm 
happy to report that we'll be integrating four community talks in 
addition to several presentations by SchedMD staff.


cheers,
- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



Re: [slurm-users] Slurm User Group Meeting at BYU, Oct. 19-20th - Call For Presentations

2022-07-06 Thread Tim Wickberg
One final reminder before submissions for SLUG presentations close 
Sunday night.


- Tim

On 6/21/22 12:44, Tim Wickberg wrote:
As a reminder - the initial call for presentations for SLUG'22 ends on 
July 10th.


If you are interested in presenting please send your abstracts to 
sl...@schedmd.com.


- Tim

On 5/24/22 12:05, Tim Wickberg wrote:
SchedMD will be hosting the Slurm User Group Meeting 2022 (SLUG'22) 
in-person at Brigham Young University in Provo, Utah, this fall on 
October 19 - 20th.


Additional details will be sent out as the schedule is finalized, but 
at this time we want to open up the call for abstracts:


You are invited to submit an abstract of a tutorial, technical 
presentation or site report to be given at the 2022 Slurm User Group 
Meeting. This event is sponsored and organized by Brigham Young 
University and SchedMD. This international event is open to those 
who want to:


   * Learn more about Slurm, the premier HPC workload manager
   * Share their knowledge and experience with other users and sysadmins
   * Get detailed information about the latest features and developments
   * Share requirements and discuss future developments

Everyone who wants to present their own usage, developments, site 
report, or tutorial about Slurm is invited to send an abstract to 
sl...@schedmd.com.


*Important Dates:*
10 July 2022: Abstracts due
19 July 2022: Notification of acceptance

*Slurm User Group Meeting 2022*
19-20 October 2022
Provo, Utah




Re: [slurm-users] Slurm User Group Meeting at BYU, Oct. 19-20th - Call For Presentations

2022-06-21 Thread Tim Wickberg
As a reminder - the initial call for presentations for SLUG'22 ends on 
July 10th.


If you are interested in presenting please send your abstracts to 
sl...@schedmd.com.


- Tim

On 5/24/22 12:05, Tim Wickberg wrote:
SchedMD will be hosting the Slurm User Group Meeting 2022 (SLUG'22) 
in-person at Brigham Young University in Provo, Utah, this fall on 
October 19 - 20th.


Additional details will be sent out as the schedule is finalized, but 
at this time we want to open up the call for abstracts:


You are invited to submit an abstract of a tutorial, technical 
presentation or site report to be given at the 2022 Slurm User Group 
Meeting. This event is sponsored and organized by Brigham Young 
University and SchedMD. This international event is open to those who 
want to:


   * Learn more about Slurm, the premier HPC workload manager
   * Share their knowledge and experience with other users and sysadmins
   * Get detailed information about the latest features and developments
   * Share requirements and discuss future developments

Everyone who wants to present their own usage, developments, site 
report, or tutorial about Slurm is invited to send an abstract to 
sl...@schedmd.com.


*Important Dates:*
10 July 2022: Abstracts due
19 July 2022: Notification of acceptance

*Slurm User Group Meeting 2022*
19-20 October 2022
Provo, Utah




[slurm-users] Slurm version 22.05.2 is now available

2022-06-16 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 22.05.2.

This includes one significant fix to prevent a potential slurmctld crash 
if an array job is submitted with a "--gres" request.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 22.05.2
==
 -- Fix a segfault in slurmctld when requesting gres in job arrays.
 -- Prevent jobs from launching on newly powered up nodes that register with
invalid config.
 -- Fix a segfault when there's no memory.swap.current interface in cgroup/v2.
 -- Fix memleak in cgroup/v2.




[slurm-users] Slurm version 22.05.1 is now available

2022-06-14 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 22.05.1.

This includes one significant fix for a regression introduced in 22.05.0 
that can lead to over-subscription of licenses. For sites running 
22.05.0, the new "bf_licenses" option to SchedulerParameters will resolve 
this issue; otherwise, upgrading to this new maintenance release is 
strongly encouraged.
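
For sites staying on 22.05.0, a minimal sketch of that workaround in 
slurm.conf (followed by an 'scontrol reconfigure'); treat this as 
illustrative only:

   # slurm.conf - enable license planning in the backfill scheduler
   SchedulerType=sched/backfill
   SchedulerParameters=bf_licenses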


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 22.05.1
==
 -- Flush the list of Include config files on SIGHUP.
 -- Fix and update Slurm completion script.
 -- jobacct_gather/cgroup - Add VMem support both for cgroup v1 and v2.
 -- Allow subset of node state transitions when node is in INVAL state.
 -- Remove INVAL state from cloud node after being powered down.
 -- When showing reason UID in scontrol show node, use the authenticated UID
instead of the login UID.
 -- Fix calculation of reservation's NodeCnt when using dynamic nodes.
 -- Add SBATCH_{ERROR,INPUT,OUTPUT} input environment variables for --error,
--input and --output options respectively (a short example follows this
list).
 -- Prevent oversubscription of licenses by the backfill scheduler when not
using the new "bf_licenses" option.
 -- Jobs with multiple nodes in a heterogeneous cluster now have access to all
the memory on each node by using --mem=0. Previously the memory limit was
set by the node with the least amount of memory.
 -- Don't limit the size of TaskProlog output (previously TaskProlog output was
limited to 4094 characters per line, which limited the size of exported
environment variables or logging to the task).
 -- Fix usage of possibly uninitialized buffer in proctrack/cgroup.
 -- Fix memleak in proctrack/cgroup proctrack_p_wait.
 -- Fix cloud/remote het srun jobs.
 -- Fix a segfault that may happen on gpu configured as no_consume.
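
As a short example of the SBATCH_{ERROR,INPUT,OUTPUT} variables noted 
above (filenames are illustrative); they act as defaults for the 
corresponding command-line options:

   # Equivalent to passing --output/--error on the sbatch command line
   $ export SBATCH_OUTPUT=job-%j.out
   $ export SBATCH_ERROR=job-%j.err
   $ sbatch job.sh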




[slurm-users] Slurm version 22.05 is now available

2022-05-26 Thread Tim Wickberg

We are pleased to announce the availability of Slurm release 22.05.0.

To highlight some new features in 22.05:

- Support for dynamic node addition and removal
  (https://slurm.schedmd.com/dynamic_nodes.html)
- Support for native Linux cgroup v2 operation
- Newly added plugins to support HPE Slingshot 11 networks
  (switch/hpe_slingshot), and Intel Xe GPUs (gpu/oneapi)
- Added new acct_gather_interconnect/sysfs plugin to collect statistics
  from arbitrary network interfaces.
- Expanded and synced set of environment variables available in the
  Prolog/Epilog/PrologSlurmctld/EpilogSlurmctld scripts.
- New "--prefer" option to job submissions to allow for a "soft
  constraint" request to influence node selection.
- Optional support for license planning in the backfill scheduler with
  "bf_licenses" option in SchedulerParameters.

The main Slurm documentation site at https://slurm.schedmd.com/ has been 
updated now as well.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm User Group Meeting at BYU, Oct. 19-20th - Call For Presentations

2022-05-24 Thread Tim Wickberg
SchedMD will be hosting the Slurm User Group Meeting 2022 (SLUG'22) 
in-person at Brigham Young University in Provo, Utah, this fall on 
October 19 - 20th.


Additional details will be sent out as the schedule is finalized, but 
at this time we want to open up the call for abstracts:


You are invited to submit an abstract of a tutorial, technical 
presentation or site report to be given at the 2022 Slurm User Group 
Meeting. This event is sponsored and organized by Brigham Young 
University and SchedMD. This international event is open to those who 
want to:


  * Learn more about Slurm, the premier HPC workload manager
  * Share their knowledge and experience with other users and sysadmins
  * Get detailed information about the latest features and developments
  * Share requirements and discuss future developments

Everyone who wants to present their own usage, developments, site 
report, or tutorial about Slurm is invited to send an abstract to 
sl...@schedmd.com.


*Important Dates:*
10 July 2022: Abstracts due
19 July 2022: Notification of acceptance

*Slurm User Group Meeting 2022*
19-20 October 2022
Provo, Utah



[slurm-users] Slurm release candidate version 22.05rc1 available for testing

2022-05-12 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 22.05rc1.


To highlight some new features coming in 22.05:

- Support for dynamic node addition and removal
- Support for native cgroup/v2 operation
- Newly added plugins to support HPE Slingshot 11 networks 
(switch/hpe_slingshot), and Intel Xe GPUs (gpu/oneapi)
- Added new acct_gather_interconnect/sysfs plugin to collect statistics 
from arbitrary network interfaces.
- Expanded and synced the set of environment variables available in the 
Prolog/Epilog/PrologSlurmctld/EpilogSlurmctld scripts.
- Added "--prefer" option to job submissions to allow for a "soft 
constraint" request to influence node selection.


This is the first release candidate of the upcoming 22.05 release 
series, and represents the end of development for this release, and a 
finalization of the RPC and state file formats.


If any issues are identified with this release candidate, please report 
them through https://bugs.schedmd.com against the 22.05.x version and we 
will address them before the first production 22.05.0 release is made.


Please note that the release candidates are not intended for production use.

A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ .


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



Re: [slurm-users] Slurm versions 21.08.8 and 20.11.9 are now available (CVE-2022-29500, 29501, 29502)

2022-05-05 Thread Tim Wickberg

And, what is hopefully my final update on this:

Unfortunately I missed including a single last-minute commit in the 
21.08.8 release. That missing commit fixes a communication issue between 
a mix of patched and unpatched slurmd processes that could lead to nodes 
being incorrectly marked as offline.


That patch was included in 20.11.9. That missing commit is included in a 
new 21.08.8-2 release which is on our download page now.


If you've already started rolling out 21.08.8 on your systems, the best 
path forward is to restart all slurmd processes in the cluster immediately.


- Tim



Re: [slurm-users] Slurm versions 21.08.8 and 20.11.9 are now available (CVE-2022-29500, 29501, 29502)

2022-05-05 Thread Tim Wickberg
I wanted to provide some elaboration on the new 
CommunicationParameters=block_null_hash option based on initial feedback.


The original email said it was safe to enable after all daemons had been 
restarted. Unfortunately that statement was incomplete - the flag can 
only be safely enabled after all daemons have been restarted *and* all 
currently running jobs have completed.


The new maintenance releases - with or without this new option enabled - 
do fix the reported issues. The option is not required to secure your 
system.


This option provides an additional - redundant - layer of security 
within the cluster, and we do encourage sites to enable it at their 
earliest convenience, but only after currently running jobs (with an 
associated unpatched slurmstepd process) have all completed.
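
For reference, on a cluster that has reached that point, enabling it is
a one-line addition to slurm.conf (a sketch only; fold it into any
existing CommunicationParameters list):

  CommunicationParameters=block_null_hash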


- Tim



[slurm-users] Slurm versions 21.08.8 and 20.11.9 are now available (CVE-2022-29500, 29501, 29502)

2022-05-04 Thread Tim Wickberg
Slurm versions 21.08.8 and 20.11.9 are now available to address a 
critical security issue with Slurm's authentication handling.


SchedMD customers were informed on April 20th and provided a patch on 
request; this process is documented in our security policy [1].


For SchedMD customers: please note that there are additional changes 
included in these releases to address recently reported problems with 
PMIx, and to fix communication issues between patched and unpatched 
slurmd processes.




CVE-2022-29500:

An architectural flaw with how credentials are handled can be exploited 
to allow an unprivileged user to impersonate the SlurmUser account. 
Access to the SlurmUser account can be used to execute arbitrary 
processes as root.


This issue impacts all Slurm releases since at least Slurm 1.0.0.

Systems remain vulnerable until all slurmdbd, slurmctld, and slurmd 
processes have been restarted in the cluster.
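
On systems using the bundled systemd units, that typically means
something along these lines, in the usual upgrade order (a sketch only;
adjust for your own service names and rollout tooling):

  systemctl restart slurmdbd    # on the database host
  systemctl restart slurmctld   # on the controller(s)
  systemctl restart slurmd      # on every compute node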


Once all daemons have been upgraded sites are encouraged to add 
"block_null_hash" to CommunicationParameters. That new option provides 
additional protection against a potential exploit.


CVE-2022-29501:

An issue was discovered with a network RPC handler in the slurmd daemon 
used for PMI2 and PMIx support. This vulnerability could allow an 
unprivileged user to send data to an arbitrary unix socket on the host 
as the root user.


CVE-2022-29502:

An issue was found with the I/O key validation logic in the srun client 
command that could permit an attacker to attach to the user's terminal, 
and intercept process I/O. (Slurm 21.08 only.)




Due to the severity of the CVE-2022-29500 issue, SchedMD has removed all 
prior Slurm releases from our download site.


SchedMD only issues security fixes for the supported releases (currently 
21.08 and 20.11). Due to the complexity of these fixes, we do not 
recommend attempting to backport the fixes to older releases, and 
strongly encourage sites to upgrade to fixed versions immediately.


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.8
==
 -- openapi/dbv0.0.37 - fix slurmrestd fatal() when deleting an association.
 -- Allow scontrol update  Gres=... to not require "gres:".
 -- Fix inconsistent reboot message appending behavior.
 -- Fix incorrect reason_time and reason_uid on reboot message.
 -- Fix "scontrol reboot" clearing node reason on ResumeTimeout.
 -- Fix ResumeTimeout error message missing when node already has reason set.
 -- Avoid "running with local config" error when conf server is provided by DNS.
 -- openapi/v0.0.37 - resolve job user name when not sent by slurmctld.
 -- openapi/dbv0.0.37 - Correct OpenAPI specification for diag request.
 -- Ignore power_down request when node is already powering down.
 -- CVE-2022-29500 - Prevent credential abuse.
 -- CVE-2022-29501 - Prevent abuse of REQUEST_FORWARD_DATA.
 -- CVE-2022-29502 - Correctly validate io keys.



* Changes in Slurm 20.11.9
==
 -- burst_buffer - add missing common directory to the Makefile SUBDIRS.
 -- sacct - fix truncation when printing jobidraw field.
 -- GRES - Fix loading state of jobs using --gpus to request gpus.
 -- Fix minor logic error in health check node state output
 -- Fix GCC 11.1 compiler warnings.
 -- Delay steps when memory already used instead of rejecting step request.
 -- Fix memory leak in the slurmdbd when requesting wckeys from all clusters.
 -- Fix determining if a reservation is used or not.
 -- openapi/v0.0.35 - Honor kill_on_invalid_dependency as job parameter.
 -- openapi/v0.0.36 - Honor kill_on_invalid_dependency as job parameter.
 -- Fix various issues dealing with updates on magnetic reservations that could
lead to abort slurmctld.
 -- openapi/v0.0.36 - Avoid setting default values of min_cpus, job name, cwd,
mail_type, and contiguous on job update.
 -- openapi/v0.0.36 - Clear user hold on job update if hold=false.
 -- Fix slurmctld segfault due to a bit_test() call with a MAINT+ANY_NODES
reservation NULL node_bitmap.
 -- Fix slurmctld segfault due to a bit_copy() call with a REPLACE+ANY_NODES
reservation NULL node_bitmap.
 -- Fix error in GPU frequency validation logic.
 -- Fix error in pmix logic dealing with the incorrect size of buffer.
 -- PMIx v1.1.4 and below are no longer supported.
 -- Fix shutdown of slurmdbd plugin to correctly notice when the agent thread
finishes.
 -- Fix slurmctld segfault due to job array --batch features double free.
 -- CVE-2022-29500 - Prevent credential abuse.
 -- CVE-2022-29501 - Prevent abuse of REQUEST_FORWARD_DATA.




[slurm-users] Slurm version 21.08.7 is now available

2022-04-19 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 21.08.7.

This includes a number of minor to moderate severity fixes that have 
accumulated since the last maintenance release was made two months ago.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.7
==
 -- openapi/v0.0.37 - correct calculation for bf_queue_len_mean in /diag.
 -- Optimize sending down nodes in maintenance mode to the database when
removing reservations.
 -- Avoid shrinking a reservation when overlapping with downed nodes.
 -- Fix 'planned time' in rollups for jobs that were still pending when the
rollup happened.
 -- Prevent new elements from a job array from causing rerollups.
 -- Only check TRES limits against current usage for TRES requested by the job.
 -- Do not allocate shared gres (MPS) in whole-node allocations
 -- Fix minor memory leak when dealing with configless setups.
 -- Constrain slurmstepd to job/step cgroup like in previous versions of Slurm.
 -- Fix warnings on 32-bit compilers related to printf() formats.
 -- Fix memory leak when freeing kill_job_msg_t.
 -- Fix memory leak when using data_t.
 -- Fix reconfigure issues after disabling/reenabling the GANG PreemptMode.
 -- Fix race condition where a cgroup was being deleted while another step
was creating it.
 -- Set the slurmd port correctly if multi-slurmd
 -- openapi/v0.0.37 - Fix misspelling of account_gather_frequency in spec.
 -- openapi/v0.0.37 - Fix misspelling of cluster_constraint in spec.
 -- Fix FAIL mail not being sent if a job was cancelled due to preemption.
 -- slurmrestd - move debug logs for HTTP handling to be gated by debugflag
NETWORK to avoid unnecessary logging of communication contents.
 -- Fix issue with bad memory access when shrinking running steps.
 -- Fix various issues with internal job accounting with GRES when jobs are
shrunk.
 -- Fix ipmi polling on slurmd reconfig or restart.
 -- Fix srun crash when reserved ports are being used and het step fails
to launch.
 -- openapi/dbv0.0.37 - fix DELETE execution path on /user/{user_name}.
 -- slurmctld - Properly requeue all components of a het job if PrologSlurmctld
fails.
 -- rlimits - remove final calls to limit nofiles to 4096 but to instead use
the max possible nofiles in slurmd and slurmdbd.
 -- Fix slurmctld memory leak after a reconfigure with configless.
 -- Fix slurmd memory leak when fetching configless files.
 -- Allow the DBD agent to load large messages (up to MAX_BUF_SIZE) from state.
 -- Fix minor memory leak with cleaning up the extern step.
 -- Fix potential deadlock during slurmctld restart when there is a completing
job.
 -- slurmstepd - reduce user requested soft rlimits when they are above max
hard rlimits to avoid rlimit request being completely ignored and
processes using default limits.
 -- Fix memory leaks when job/step specifies a container.
 -- Fix Slurm user commands displaying available features as active features
when no features were active.
 -- Don't power down nodes that are rebooting.
 -- Clear pending node reboot on power down request.
 -- Ignore node registrations while node is powering down.
 -- Don't reboot any node that is powered down.
 -- Don't allow a node to reboot if it's marked for power down.
 -- Fix issuing reboot and downing when rebooting a powering up node.
 -- Clear DRAIN on node after failing to resume before ResumeTimeout.
 -- Prevent repeating power down if node fails to resume before ResumeTimeout.
 -- Fix federated cloud node communication with srun and cloud_dns.
 -- Fix jobs being scheduled on nodes marked to be powered_down when idle.




[slurm-users] Slurm version 21.08.6 is now available

2022-02-24 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 21.08.6.

This includes a number of fixes since the last maintenance release was 
made in December, including an important fix to a regression seen when 
using the 'mpirun' command within a job script.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.6
==
 -- Handle typed shared GRES better in accounting.
 -- Fix plugin_name definitions in a number of plugins to improve logging.
 -- Close sbcast file transfers when job is cancelled.
 -- job_submit/lua - allow mail_type and mail_user fields to be modified.
 -- scrontab - fix handling of --gpus and --ntasks-per-gpu options.
 -- sched/backfill - fix job_queue_rec_t memory leak.
 -- Fix magnetic reservation logic in both main and backfill schedulers.
 -- job_container/tmpfs - fix memory leak when using InitScript.
 -- slurmrestd / openapi - fix memory leaks.
 -- Fix slurmctld segfault due to job array resv_list double free.
 -- Fix multi-reservation job testing logic.
 -- Fix slurmctld segfault due to insufficient job reservation parse validation.
 -- Fix main and backfill schedulers handling for already rejected job array.
 -- sched/backfill - restore resv_ptr after yielding locks.
 -- acct_gather_energy/xcc - appropriately close and destroy the IPMI context.
 -- Protect slurmstepd from making multiple calls to the cleanup logic.
 -- Prevent slurmstepd segfault at cleanup time in mpi_fini().
 -- Fix slurmctld sometimes hanging if shutdown while PrologSlurmctld or
EpilogSlurmctld were running and PrologEpilogTimeout is set in slurm.conf.
 -- Fix affinity of the batch step if batch host is different than the first
node in the allocation.
 -- slurmdbd - fix segfault after multiple failover/failback operations.
 -- Fix jobcomp filetxt job selection condition.
 -- Fix -f flag of sacct not being used.
 -- Select cores for job steps according to the socket distribution. Previously,
sockets were always filled before selecting cores from the next socket.
 -- Keep node in Future state if epilog completes while in Future state.
 -- Fix erroneous --constraint behavior by preventing multiple sets of brackets.
 -- Make ResetAccrueTime update the job's accrue_time to now.
 -- Fix sattach initialization with configless mode.
 -- Revert packing limit checks affecting pmi2.
 -- sacct - fixed assertion failure when using -c option and a federation
display
 -- Fix issue that allowed steps to overallocate the job's memory.
 -- Fix the sanity check mode of AutoDetect so that it actually works.
 -- Fix deallocated nodes that didn't actually launch a job from waiting for
Epilogslurmctld to complete before clearing completing node's state.
 -- Job should be in a completing state while EpilogSlurmctld is running when
being requeued.
 -- Fix job not being requeued properly if all node epilog's completed before
EpilogSlurmctld finished.
 -- Keep job completing until EpilogSlurmctld is completed even when "downing"
a node.
 -- Fix handling reboot with multiple job features.
 -- Fix nodes getting powered down when creating new partitions.
 -- Fix bad bit_realloc which potentially could lead to bad memory access.
 -- slurmctld - remove limit on the number of open files.
 -- Fix bug where job_state file of size above 2GB wasn't saved without any
error message.
 -- Fix various issues with no_consume gres.
 -- Fix regression in 21.08.0rc1 where job steps failed to launch on systems
that reserved a CPU in a cgroup outside of Slurm (for example, on systems
with WekaIO).
 -- Fix OverTimeLimit not being reset on scontrol reconfigure when it is
removed from slurm.conf.
 -- serializer/yaml - use dynamic buffer to allow creation of YAML outputs
larger than 1MiB.
 -- Fix minor memory leak affecting openapi users at process termination.
 -- Fix batch jobs not resolving the username when nss_slurm is enabled.
 -- slurmrestd - Avoid slurmrestd ignoring invalid HTTP method if the response
serialized without error.
 -- openapi/dbv0.0.37 - Correct conditional that caused the diag output to
give an internal server error status on success.
 -- Make --mem-bind=sort work with task_affinity
 -- Fix sacctmgr to set MaxJobsAccruePer{User|Account} and MinPrioThres in
sacctmgr add qos, modify already worked correctly.
 -- job_container/tmpfs - avoid printing extraneous error messages in Prolog
and Epilog, and when the job completes.
 -- Fix step CPU memory allocation with --threads-per-core without --exact.
 -- Remove implicit --exact when --threads-per-core or --hint=nomultithread
is used.
 -- Do not allow a step to request more threads per core than the
allocation did.
 -- Remove implicit --exact when --cpus-per-task is used.




[slurm-users] Slurm version 21.08.5 is now available

2021-12-21 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 21.08.5.

This includes a number of moderate severity fixes since the last 
maintenance release a month ago.


And, as it appears to be _en vogue_ to discuss log4j issues, I'll take a 
moment to state that Slurm is unaffected by the recent log4j 
disclosures. Slurm is written in C, does not use log4j, and Slurm's 
logging subsystems are not vulnerable to the class of issues that have 
led to those exploits.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.5
==
 -- Fix issue where typeless GRES node updates were not immediately reflected.
 -- Fix setting the default scrontab job working directory so that it's the home
of the different user (-u ) and not that of root or SlurmUser editor.
 -- Fix stepd not respecting SlurmdSyslogDebug.
 -- Fix concurrency issue with squeue.
 -- Fix job start time not being reset after launch when job is packed onto
already booting node.
 -- Fix updating SLURM_NODE_ALIASES for jobs packed onto powering up nodes.
 -- Cray - Fix issues with starting hetjobs.
 -- auth/jwks - Print fatal() message when jwks is configured but file could
not be opened.
 -- If sacctmgr has an association with an unknown qos as the default qos
print 'UNKN-###' instead of leaving a blank name.
 -- Correctly determine task count when giving --cpus-per-gpu, --gpus and
--ntasks-per-node without task count.
 -- slurmctld - Fix places where the global last_job_update was not being set
to the time of update when a job's reason and description were updated.
 -- slurmctld - Fix case where a job submitted with more than one partition
would not have its reason updated while waiting to start.
 -- Fix memory leak in node feature rebooting.
 -- Fix time limit permanently set to 1 minute by backfill for job array tasks
higher than the first with QOS NoReserve flag and PreemptMode configured.
 -- Fix sacct -N to show jobs that started in the current second
 -- Fix issue on running steps where both SLURM_NTASKS_PER_TRES and
SLURM_NTASKS_PER_GPU are set.
 -- Handle oversubscription request correctly when also requesting
--ntasks-per-tres.
 -- Correctly detect when a step requests bad gres inside an allocation.
 -- slurmstepd - Correct possible deadlock when UnkillableStepTimeout triggers.
 -- srun - use maximum number of open files while handling job I/O.
 -- Fix writing to Xauthority files on root_squash NFS exports, which was
preventing X11 forwarding from completing setup.
 -- Fix regression in 21.08.0rc1 that broke --gres=none.
 -- Fix srun --cpus-per-task and --threads-per-core not implicitly setting
--exact. It was meant to work this way in 21.08.
 -- Fix regression in 21.08.0 that broke dynamic future nodes.
 -- Fix dynamic future nodes remembering active state on restart.
 -- Fix powered down nodes getting stuck in COMPLETING+POWERED_DOWN when job is
cancelled before nodes are powering up.





[slurm-users] Slurm version 21.08.4 is now available (CVE-2021-43337)

2021-11-16 Thread Tim Wickberg
Slurm version 21.08.4 is now available, and includes a series of recent 
bug fixes, as well as a moderate security fix.


Note that this security issue is only present in the 21.08 release 
series. Slurm 20.11 and older releases are unaffected.


SchedMD customers were informed of this issue on November 2nd and 
provided a fix on request; this process is documented in our security 
policy. [1]


CVE-2021-43337:
For sites using the new AccountingStoreFlags=job_script and/or job_env
options, an issue was reported with the access control rules in SlurmDBD
that will permit users to request job scripts and environment files that
they should not have access to.

(Scripts/environments are meant to only be accessible by user accounts
with administrator privileges, by account coordinators for jobs
submitted under their account, and by the user themselves.)
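
For sites that do use them, these options are enabled in slurm.conf, for
example:

  AccountingStoreFlags=job_script,job_env

and the stored data is later retrieved through sacct (e.g. its
--batch-script and --env-vars options).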

Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.4
==
 -- Fix potential deadlock when using PMI v1.
 -- Fix tight loop sending DBD_SEND_MULT_JOB_START when the slurmctld has an
issue talking correctly to the DBD.
 -- Fix memory leak in step creation.
 -- Fix potential deadlock when shutting down slurmctld.
 -- Fix regression in 21.08 where multi-node steps that requested MemPerCPU
were not counted against the job's memory allocation on some nodes.
 -- Fix issue with select/cons_tres and the partition limit MaxCpusPerNode where
the limit was enforced for one less CPU than the configured value.
 -- jobacct_gather/common - compare Pss to Rss after scaling Pss to Rss units.
 -- Fix SLURM_NODE_ALIASES in RPC Prolog for batch jobs.
 -- Fix regression in 21.08 where slurmd and slurmstepd were not constrained
with CpuSpecList or CoreSpecCount.
 -- Fix cloud jobs running without powering up nodes after a reconfig/restart.
 -- CVE-2021-43337 - Fix security issue with new AccountingStoreFlags=job_script
and job_env options where users could request scripts and environments they
should not have been permitted to access.




[slurm-users] Slurm BoF and booth at SC21

2021-11-12 Thread Tim Wickberg
The Slurm Birds-of-a-Feather session will be held virtually on Thursday, 
November at 12:15 - 1:15pm (Central). This is conducted through the SC21 
HUBB platform, and you will need to have registered in some capacity 
through the conference to be able to participate live.


We'll be reviewing the Slurm 21.08 release, as well as taking a look at the 
roadmap for Slurm 22.05 and beyond. The remainder of the time will be 
reserved for live Q+A as we've traditionally done.


One note: SC21 has told us that they will not be recording any of the 
BoFs this year, and they will only be available live through their 
platform. However, SchedMD will be posting a recording of the Slurm BoF 
on our YouTube channel at a later point to ensure the broader community 
has access to it.


In addition to the BoF, there will be presentations in the Slurm booth - 
#1807 - over the course of the week. The tentative schedule is:


Tuesday:
11am - Introduction to Slurm
1pm - REST API
3pm - Google Cloud
5pm - Introduction to Slurm

Wednesday:
11am - Slurm in the Clouds
1pm - Introduction to Slurm
3pm - REST API
5pm - Introduction to Slurm

Thursday:
11am - Introduction to Slurm
1pm - Introduction to Slurm

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 21.08.3 is now available

2021-11-02 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 21.08.3.

This includes a number of fixes since the last release a month ago, 
including one critical fix to prevent a communication issue between 
slurmctld and slurmdbd for sites that have started using the new 
AccountingStoreFlags=job_script functionality.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.3
==
 -- Return error to sacctmgr when running 'sacctmgr archive load' and the load
fails due to an invalid or corrupted file.
 -- slurmctld/gres_ctld - fix deallocation of typed GRES without device.
 -- scrontab - fix capturing the cronspec request in the job script.
 -- openapi/dbv0.0.37 - Add missing method POST for /associations/.
 -- If ALTER TABLE was already run, continue with database upgrade.
 -- slurmstepd - Gracefully handle RunTimeQuery returning no output.
 -- srun - automatically handle issues with races to listen() on an ephemeral
socket, and suppress otherwise needless error messages.
 -- Schedule sooner after Epilog completion with SchedulerParameters=defer.
 -- Improve performance for AccountingStoreFlags=job_env.
 -- Expose missing SLURMD_NODENAME and SLURM_NODEID to TaskEpilog environment.
 -- Bring slurm_completion.sh up to date with changes to commands.
 -- Fix issue where burst buffer stage-in could only start for one job in a job
array per scheduling cycle instead of bb_array_stage_cnt jobs per scheduling
cycle.
 -- Fix checking if the dependency is the same job for array jobs.
 -- Fix checking for circular dependencies with job arrays.
 -- Restore dependent job pointers on slurmctld startup to avoid race.
 -- openapi/v0.0.37 - Allow strings for JobIds instead of only numerical JobIds
for GET, DELETE, and POST job methods.
 -- openapi/dbv0.0.36 - Gracefully handle missing associations.
 -- openapi/dbv0.0.36 - Avoid restricting job association lookups to only
default associations.
 -- openapi/dbv0.0.37 - Gracefully handle missing associations.
 -- openapi/dbv0.0.37 - Avoid restricting job association lookups to only
default associations.
 -- Fix error in GPU frequency validation logic.
 -- Fix regression in 21.08.1 that broke federated jobs.
 -- Correctly handle requested GRES when used in job arrays.
 -- Fix error in pmix logic dealing with the incorrect size of buffer.
 -- Fix handling of no_consume GRES, add it to allocated job allocated TRES.
 -- Fix issue with typed GRES without Files= (bitmap).
 -- Fix job_submit/lua support for 'gres' which is now stored as a 'tres'
when requesting jobs so needs a 'gres' prefix.
 -- Fix regression where MPS would not deallocate from the node properly.
 -- Fix --gpu-bind=verbose to work correctly.
 -- Do not deny --constraint with special operators "[]()|*" when no changeable
features are requested, but continue to deny --constraint with special
operators when changeable features are requested.
 -- openapi/v0.0.{35,36,37} - prevent merging the slurmrestd environment
alongside a new job submission.
 -- openapi/dbv0.0.36 - Correct tree position of dbv0.0.36_job_step.
 -- openapi/dbv0.0.37 - Correct tree position of dbv0.0.37_job_step.
 -- openapi/v0.0.37 - enable job priority field for job submissions and updates.
 -- openapi/v0.0.37 - request node states query includes MIXED state instead of
only allocated.
 -- mpi/pmix - avoid job hanging until the time limit on PMIx agent failures.
 -- Correct inverted logic where reduced version matching applied to non-SPANK
plugins where it should have only applied to SPANK plugins.
 -- Fix issues where prologs would run in serial without PrologFlags=serial.
 -- Make sure a job coming in is initially considered for magnetic reservations.
 -- PMIx v1.1.4 and below are no longer supported.
 -- Add comment to service files about disabling logging through journald.
 -- Add SLURM_NODE_ALIASES env to RPC Prolog (PrologFlags=alloc) environment.
 -- Limit max_script_size to 512 MB.
 -- Fix shutdown of slurmdbd plugin to correctly notice when the agent thread
finishes.
 -- slurmdbd - fix issue with larger batch script files being sent to SlurmDBD
with AccountingStoreFlags=job_script that can lead to accounting data loss
as the resulting RPC generated can exceed internal limits and won't be
sent, preventing further communication with SlurmDBD.
This issue is indicated by "error: Invalid msg_size" in your log files.
 -- Fix compile issue with --without-shared-libslurm.




[slurm-users] Slurm version 21.08.2 is now available

2021-10-05 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 21.08.2.

There is one significant change included in this maintenance release: the 
removal of support for the long-misunderstood TaskAffinity=yes option in 
cgroup.conf. Please consider using "TaskPlugin=task/cgroup,task/affinity" in 
slurm.conf as an alternative.


Unfortunately, a number of issues were identified where the processor affinity 
settings from this now-unsupported approach would be calculated 
incorrectly, leading to potential performance issues.


SchedMD had been previously planning to remove this support in the next 
22.05 release, but a number of issues reported after the cgroup code 
refactoring have led us to remove this now, rather than try to correct 
issues with what has not been a recommended configuration for some time.
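
For sites that had been relying on TaskAffinity=yes, the supported
equivalent is to let the task plugins in slurm.conf handle affinity
instead (a sketch only):

  # slurm.conf
  TaskPlugin=task/cgroup,task/affinity

  # cgroup.conf - remove the TaskAffinity line entirely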


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.2
==
 -- slurmctld - fix how the max number of cores on a node in a partition are
calculated when the partition contains multi-socket nodes. This in turn
corrects certain jobs node count estimations displayed client-side.
 -- job_submit/cray_aries - fix "craynetwork" GRES specification after changes
introduced in 21.08.0rc1 that made TRES always have a type prefix.
 -- Ignore nonsensical check in the slurmd for [Pro|Epi]logSlurmctld.
 -- Fix writing to stderr/syslog when systemd runs slurmctld in the foreground.
 -- Fix locking around log level setting routines.
 -- Fix issue with updating job started with node range.
 -- Fix issue with nodes not clearing state in the database when the slurmctld
is started with clean-start.
 -- Fix hetjob components > 1 timing out due to InactiveLimit.
 -- Fix sprio printing -nan for normalized association priority if
PriorityWeightAssoc was not defined.
 -- Disallow FirstJobId=0.
 -- Preserve job start info in the database for a requeued job that hadn't
registered the first time in the database yet.
 -- Only send one message on prolog failure from the slurmd.
 -- Remove support for TaskAffinity=yes in cgroup.conf.
 -- accounting_storage/mysql - fix issue where querying jobs via sacct
--whole-hetjob=yes or slurmrestd (which automatically includes this flag)
could in some cases return more records than expected.
 -- Fix issue for preemption of job array task that makes afterok dependency
fail. Additionally, send emails when requeueing happens due to preemption.
 -- Fix sending requeue mail type.
 -- Properly resize a job's GRES bitmaps and counts when resizing the job.
 -- Fix node being able to transition to CLOUD state from non-cloud state.
 -- Fix regression introduced in 21.08.0rc1 which broke a step's ability to
inherit GRES from the job when the step didn't request GRES but the job did.
 -- Fix errors in logic when picking nodes based on bracketed anded constraints.
This also enforces the requirement to have a count when using such
constraints.
 -- Handle job resize better in the database.
 -- Exclude currently running, resized jobs from the runaway jobs list.
 -- Make it possible to shrink a job more than once.




[slurm-users] Slides and video from the SLUG'21 presentations are online

2021-09-22 Thread Tim Wickberg
The slides from SLUG'21 have now been uploaded to the Slurm Publication 
Archive:


https://slurm.schedmd.com/publications.html

The video recordings will remain[0] on our YouTube Channel for at least 
the next two weeks:


https://www.youtube.com/c/schedmd-slurm

As mentioned at the end of my presentation, we will have the Slurm 
Community booth at SC'21 in St. Louis, although SchedMD will only have a 
skeleton crew on site.


The Slurm Birds-of-a-Feather session was approved for SC'21. We've 
requested this be held fully virtual but are waiting for confirmation, 
and will send further details on that as we get closer to the event.


And thank you to everyone who showed up during the live presentations 
yesterday. The presenters always appreciate the feedback and questions, 
even if it's not quite as interactive as our in-person meetings.


cheers,
- Tim

[0] Apologies to anyone who went looking for them yesterday - the live 
streams apparently take 12 hours before being listed on the 'Uploads' 
section which is the default view for the channel. The videos were 
available, but you needed to know to look for them on the SLUG'21 
playlist or use the direct links from the agenda to get to them.


--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



Re: [slurm-users] Slurm User Group Meeting (SLUG'21) will be held on YouTube on September 21st

2021-09-20 Thread Tim Wickberg
One last reminder: the Slurm User Group Meeting will be starting at 9am 
(Mountain) on Tuesday. Hope to (virtually) see you there!


- Tim

On 9/15/21 2:50 PM, Tim Wickberg wrote:
One more reminder that the Slurm User Group Meeting (SLUG'21) will be 
held on Tuesday, streaming through YouTube Live.


The agenda's been updated with the titles for each of the five sessions, 
and links have been added to the individual streams:


https://slurm.schedmd.com/slurm_ug_agenda.html

- Tim

On 8/31/21 2:26 PM, Tim Wickberg wrote:
The Slurm User Group Meeting (SLUG'21) this fall will be online once 
again. In lieu of an in-person meeting, SchedMD will broadcast a set 
of five presentations on Tuesday, September 21st, 2021, from 9am to 
noon (MDT) on our YouTube channel:

https://www.youtube.com/c/schedmd-slurm

There is no cost to attend, and there is no registration required.

Topics include: "Field Notes" (best practices / tips + tricks), 
Containers and updates to the REST API, the new burst_buffer/lua 
plugin and slurmscript, Slurm on Cloud, and an overview of the 20.11 
and 21.08 release as well as future roadmap.


I'll also be sending a few reminders out as we get closer to the event.




[slurm-users] Slurm version 21.08.1 is now available

2021-09-16 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 21.08.1.

For sites using scrontab, there is a critical fix included to ensure 
that the cron jobs continue to repeat indefinitely into the future.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 21.08.1
==
 -- Fix potential memory leak if a problem happens while allocating GRES for
a job.
 -- If an overallocation of GRES happens terminate the creation of a job.
 -- AutoDetect=nvml: Fatal if no devices found in MIG mode.
 -- slurm.spec - fix querying for PMIx and UCX version.
 -- Print federation and cluster sacctmgr error messages to stderr.
 -- Fix off by one error in --gpu-bind=mask_gpu.
 -- Fix statement condition in http_parser autoconf macro.
 -- Fix statement condition in netloc autoconf macro.
 -- Add --gpu-bind=none to disable gpu binding when using --gpus-per-task.
 -- Handle the burst buffer state "alloc-revoke" which previously would not
display in the job correctly.
 -- Fix issue in the slurmstepd SPANK prolog/epilog handler where configuration
values were used before being initialized.
 -- Restore a step's ability to utilize all of an allocations memory if --mem=0.
 -- Fix --cpu-bind=verbose garbage taskid.
 -- Fix cgroup task affinity issues from garbage taskid info.
 -- Make gres_job_state_validate() client logging behavior as before 44466a4641.
 -- Fix steps with --hint overriding an allocation with --threads-per-core.
 -- Require requesting a GPU if --mem-per-gpu is requested.
 -- Return error early if a job is requesting --ntasks-per-gpu and no gpus or
task count.
 -- Properly clear out pending step if unavailable to run with available
resources.
 -- Kill all processes spawned by burst_buffer.lua, including descendants.
 -- openapi/v0.0.{35,36,37} - Avoid setting default values of min_cpus,
job name, cwd, mail_type, and contiguous on job update.
 -- openapi/v0.0.{35,36,37} - Clear user hold on job update if hold=false.
 -- Prevent CRON_JOB flag from being cleared when loading job state.
 -- sacctmgr - Fix deleting WCKeys when not specifying a cluster.
 -- Fix getting memory for a step when the first node in the step isn't the
first node in the allocation.
 -- Make SelectTypeParameters=CR_Core_Memory default for cons_tres and cons_res.
 -- Correctly handle mutex unlocks in the gres code if failures happen.
 -- Give better error message if -m plane is given with no size.
 -- Fix --distribution=arbitrary for salloc.
 -- Fix jobcomp/script regression introduced in 21.08.0rc1 0c75b9ac9d.
 -- Only send the batch node in the step_hostlist in the job credential.
 -- When setting affinity for the batch step don't assume the batch host is node
0.
 -- In task/affinity better checking for node existence when laying out
affinity.
 -- slurmrestd - fix job submission with auth/jwt.




Re: [slurm-users] Slurm User Group Meeting (SLUG'21) will be held on YouTube on September 21st

2021-09-15 Thread Tim Wickberg
One more reminder that the Slurm User Group Meeting (SLUG'21) will be 
held on Tuesday, streaming through YouTube Live.


The agenda's been updated with the titles for each of the five sessions, 
and links have been added to the individual streams:


https://slurm.schedmd.com/slurm_ug_agenda.html

- Tim

On 8/31/21 2:26 PM, Tim Wickberg wrote:
The Slurm User Group Meeting (SLUG'21) this fall will be online once 
again. In lieu of an in-person meeting, SchedMD will broadcast a set of 
five presentations on Tuesday, September 21st, 2021, from 9am to noon 
(MDT) on our YouTube channel:

https://www.youtube.com/c/schedmd-slurm

There is no cost to attend, and there is no registration required.

Topics include: "Field Notes" (best practices / tips + tricks), 
Containers and updates to the REST API, the new burst_buffer/lua plugin 
and slurmscript, Slurm on Cloud, and an overview of the 20.11 and 21.08 
release as well as future roadmap.


I'll also be sending a few reminders out as we get closer to the event.

Hope to (virtually) see you there!
- Tim





[slurm-users] Slurm User Group Meeting (SLUG'21) will be held on YouTube on September 21st

2021-08-31 Thread Tim Wickberg
The Slurm User Group Meeting (SLUG'21) this fall will be online once 
again. In lieu of an in-person meeting, SchedMD will broadcast a set of 
five presentations on Tuesday, September 21st, 2021, from 9am to noon 
(MDT) on our YouTube channel:

https://www.youtube.com/c/schedmd-slurm

There is no cost to attend, and there is no registration required.

Topics include: "Field Notes" (best practices / tips + tricks), 
Containers and updates to the REST API, the new burst_buffer/lua plugin 
and slurmscript, Slurm on Cloud, and an overview of the 20.11 and 21.08 
release as well as future roadmap.


I'll also be sending a few reminders out as we get closer to the event.

Hope to (virtually) see you there!
- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 21.08 is now available

2021-08-26 Thread Tim Wickberg
After 9 months of development and testing we are pleased to announce the 
availability of Slurm version 21.08!


Slurm 21.08 includes a number of new features including:

- A new "AccountingStoreFlags=job_script" option to store the job 
scripts directly in SlurmDBD.


- Added "sacct -o SubmitLine" format option to get the submit line of a 
job/step.


- Changes to the node state management so that nodes are marked as 
PLANNED instead of IDLE if the scheduler is still accumulating resources 
while waiting to launch a job on them.


- RS256 token support in auth/jwt.

- Overhaul of the cgroup subsystems to simplify operation, mitigate a 
number of inherent race conditions, and prepare for future cgroup v2 
support.


- Further improvements to cloud node power state management.

- A new child process of the Slurm controller called 'slurmscriptd' 
responsible for executing PrologSlurmctld and EpilogSlurmctld scripts, 
which significantly reduces performance issues associated with enabling 
those options.


- A new burst_buffer/lua plugin allowing for site-specific asynchronous 
job data management.


- Fixes to the job_container/tmpfs plugin to allow the slurmd process to 
be restarted while the job is running without issue.


- Added json/yaml output to sacct, squeue, and sinfo commands.

- Added a new node_features/helpers plugin to provide a generic way to 
change settings on a compute node across a reboot.


- Added support for automatically detecting and broadcasting shared 
libraries for an executable launched with 'srun --bcast'.


- Added initial OCI container execution support with a new --container 
option to sbatch and srun.


- Improved job step launch throughput.

- Improved "configless" support by allowing multiple control servers to 
be specified through the slurmd --conf-server option, and send 
additional configuration files at startup including cli_filter.lua.
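
As a quick taste of two of those additions (the job id and host names
below are placeholders):

  # Show the submit line recorded for a completed job
  sacct -j 12345 -X -o JobID,SubmitLine%60

  # Point a configless slurmd at more than one control server
  slurmd --conf-server ctld1:6817,ctld2:6817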


Please see the RELEASE_NOTES distributed alongside the source for 
further details.


Thank you to all customers, partners, and community members who 
contributed to this release.


As with past releases, the documentation available at 
https://slurm.schedmd.com has been updated to the 21.08 release. Past 
versions are available in the archive. This release also marks the end 
of support for the 20.02 release. The 20.11 release will remain 
supported up until the 22.05 release next May, but will not see as 
frequent updates, and bug-fixes will be targeted for the 21.08 
maintenance releases going forward.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm release candidate version 21.08.0rc2 available for testing

2021-08-12 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 21.08.0rc2.


This is the second release candidate version of the upcoming 21.08 
release series, and corrects a number of issues identified with rc1.


If any issues are identified with this release candidate, please report 
them through https://bugs.schedmd.com against the 21.08.x version and we 
will address them before the first production 21.08.0 release is made.


Please note that the release candidates are not intended for production 
use. Barring any late-discovered issues, the state file formats should 
not change between now and 21.08.0 and are considered frozen at this 
time for the 21.08 release.


A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ .


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm release candidate version 21.08.0rc1 available for testing

2021-07-29 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 21.08.0rc1.


This is the first release candidate version of the upcoming 21.08 
release series, and represents the end of development for the release 
cycle, and a finalization of the RPC and state file formats.


If any issues are identified with this release candidate, please report 
them through https://bugs.schedmd.com against the 21.08.x version and we 
will address them before the first production 21.08.0 release is made.


Please note that the release candidates are not intended for production 
use. Barring any late-discovered issues, the state file formats should 
not change between now and 21.08.0 and are considered frozen at this 
time for the 21.08 release.


A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ .


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 20.11.8 is now available

2021-07-01 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.11.8.

This includes a number of minor-to-moderate severity bug fixes.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.8
==
 -- slurmctld - fix erroneous "StepId=CORRUPT" messages in error logs.
 -- Correct the error given when auth plugin fails to pack a credential.
 -- Fix unused-variable compiler warning on FreeBSD in fd_resolve_path().
 -- acct_gather_filesystem/lustre - only emit collection error once per step.
 -- srun - leave SLURM_DIST_UNKNOWN as default for --interactive.
 -- Add GRES environment variables (e.g., CUDA_VISIBLE_DEVICES) into the
interactive step, the same as is done for the batch step.
 -- Fix various potential deadlocks when altering objects in the database
dealing with every cluster in the database.
 -- slurmrestd - handle slurmdbd connection failures without segfaulting.
 -- slurmrestd - fix segfault for searches in slurmdb/v0.0.36/jobs.
 -- slurmrestd - remove (non-functioning) users query parameter for
slurmdb/v0.0.36/jobs from openapi.json
 -- slurmrestd - fix segfault in slurmrestd db/jobs with numeric queries
 -- slurmrestd - add argv handling for job/submit endpoint.
 -- srun - fix broken node step allocation in a heterogeneous allocation.
 -- Fail step creation if -n is not multiple of --ntasks-per-gpu.
 -- job_container/tmpfs - Fix slowdown on teardown.
 -- Fix problem with SlurmctldProlog where requeued jobs would never launch.
 -- job_container/tmpfs - Fix issue when restarting slurmd where the namespace
mount points could disappear.
 -- sacct - avoid truncating JobId at 34 characters.
 -- scancel - fix segfault when --wckey filtering option is used.
 -- select/cons_tres - Fix memory leak.
 -- Prevent file descriptor leak in job_container/tmpfs on slurmd restart.
 -- slurmrestd/dbv0.0.36 - Fix values dumped in job state/current and
job step state.
 -- slurmrestd/dbv0.0.36 - Correct description for previous state property.
 -- perlapi/libslurmdb - expose tres_req_str to job hash.
 -- scrontab - close and reopen temporary crontab file to deal with editors
that do not change the original file, but instead write out then rename
a new file.
 -- sstat - fix linking so that it will work when --without-shared-libslurm
was used to build Slurm.
 -- Clear allocated cpus for running steps in a job before handling requested
nodes on new step.
 -- Don't reject a step if not enough nodes are available. Instead, defer the
step until enough nodes are available to satisfy the request.
 -- Don't reject a step if it requests at least one specific node that is
already allocated to another step. Instead, defer the step until the
requested node(s) become available.
 -- slurmrestd - add description for slurmdb/job endpoint.
 -- Better handling of --mem=0.
 -- Ignore DefCpuPerGpu when --cpus-per-task given.
 -- sacct - fix segfault when printing StepId (or when using --long).





Re: [slurm-users] SLUG '21

2021-07-01 Thread Tim Wickberg

Unfortunately we will not be holding SLUG'21 in person.

We expect to have a virtual event again this year on Tuesday, September 
21st. I'll have more details as we get closer to that date.


- Tim

On 7/1/21 8:07 AM, Paul Brunk wrote:

Hi:

It's that time again...we're doing travel budget planning.  Do we have
a sense of whether or how there will be a user group meeting this
year?  I saw the April poll.

Thanks!

--
Grinning like an idiot,
Paul Brunk, system administrator
Georgia Advanced Computing Resource Center (GACRC)
Enterprise IT Svcs, the University of Georgia






[slurm-users] Slurm versions 20.11.7 and 20.02.7 are now available (CVE-2021-31215)

2021-05-12 Thread Tim Wickberg
Slurm versions 20.11.7 and 20.02.7 are now available, and include a 
series of recent bug fixes, as well as a critical security fix.


SchedMD customers were informed of this issue on April 28th and provided 
a fix on request; this process is documented in our security policy. [1]


CVE-2021-31215:
An issue was identified with environment handling within Slurm that can 
allow any user to run arbitrary commands as SlurmUser if the 
installation uses a PrologSlurmctld and/or EpilogSlurmctld script.
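
One quick way to check whether a cluster uses those scripts is to look
at the running configuration, for example:

  scontrol show config | grep -i -E 'PrologSlurmctld|EpilogSlurmctld'

If neither parameter is set, the Prolog/EpilogSlurmctld vector described
above does not apply, although upgrading is still strongly encouraged.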


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.7
==
 -- slurmd - handle configless failures gracefully instead of hanging
indefinitely.
 -- select/cons_tres - fix Dragonfly topology not selecting nodes in the same
leaf switch when it should as well as requests with --switches option.
 -- Fix issue where certain step requests wouldn't run if the first node in the
job allocation was full and there were idle resources on other nodes in
the job allocation.
 -- Fix deadlock issue with Slurmctld.
 -- torque/qstat - fix printf error message in output.
 -- When adding associations or wckeys avoid checking multiple times a user or
cluster name.
 -- Fix wrong jobacctgather information on a step on multiple nodes
due to timeouts sending the information gathered on its node.
 -- Fix missing xstrdup which could result in slurmctld segfault on array jobs.
 -- Fix security issue in PrologSlurmctld and EpilogSlurmctld by always
prepending SPANK_ to all user-set environment variables. CVE-2021-31215.



* Changes in Slurm 20.02.7
==
 -- cons_tres - Fix DefCpuPerGPU
 -- select/cray_aries - Correctly remove jobs/steps from blades using NPC.
 -- Fix false positive oom-kill events on extern step termination when
jobacct_gather/cgroup configured.
 -- Ensure SPANK prolog and epilog run without an explicit PlugStackConfig.
 -- Fix missing xstrdup which could result in slurmctld segfault on array jobs.
 -- Fix security issue in PrologSlurmctld and EpilogSlurmctld by always
prepending SPANK_ to all user-set environment variables. CVE-2021-31215.




[slurm-users] Slurm version 20.11.6 is now available

2021-04-27 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.11.6.

This includes a number of minor-to-moderate severity fixes, as well as 
improvements to the recently added job_container/tmpfs plugin.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.6
==
 -- Fix sacct assert with the --qos option.
 -- Use pkg-config --atleast-version instead of --modversion for systemd.
 -- common/fd - fix getsockopt() call in fd_get_socket_error().
 -- Properly handle the return from fd_get_socket_error() in _conn_readable().
 -- cons_res - Fix issue where running jobs were not taken into consideration
when creating a reservation.
 -- Avoid a deadlock between job_list for_each and assoc QOS_LOCK.
 -- Fix TRESRunMins usage for partition qos on restart/reconfig.
 -- Fix printing of number of tasks on a completed job that didn't request
tasks.
 -- Fix updating GrpTRESRunMins when decrementing job time is bigger than it.
 -- Make it so we handle multithreaded allocations correctly when doing
--exclusive or --core-spec allocations.
 -- Fix incorrect round-up division in _pick_step_cores
 -- Use appropriate math to adjust cpu counts when --ntasks-per-core=1.
 -- cons_tres - Fix consideration of power downed nodes.
 -- cons_tres - Fix DefCpuPerGPU, increase cpus-per-task to match with
gpus-per-task * cpus-per-gpu.
 -- Fix under-cpu memory auto-adjustment when MaxMemPerCPU is set.
 -- Make it possible to override CR_CORE_DEFAULT_DIST_BLOCK.
 -- Perl API - fix retrieving/storing of slurm_step_id_t in job_step_info_t.
 -- Recover state of burst buffers when slurmctld is restarted to avoid skipping
burst buffer stages.
 -- Fix race condition in burst buffer plugin which caused a burst buffer
in stage-in to not get state saved if slurmctld stopped.
 -- auth/jwt - print an error if jwt_file= has not been set in slurmdbd.
 -- Fix RESV_DEL_HOLD not being a valid state when using squeue --states.
 -- Add missing squeue selectable states in valid states error message.
 -- Fix scheduling last array task multiple times on error, causing segfault.
 -- Fix issue where a step could be allocated more memory than the job when
dealing with --mem-per-cpu and --threads-per-core.
 -- Fix removing qos from assoc with -= can lead to assoc with no qos
 -- auth/jwt - fix segfault on invalid credential in slurmdbd due to
missing validate_slurm_user() function in context.
 -- Fix single Port= not being applied to range of nodes in slurm.conf
 -- Fix Jobs not requesting a tres are not starting because of that tres limit.
 -- acct_gather_energy/rapl - fix AveWatts calculation.
 -- job_container/tmpfs - Fix issues with cleanup and slurmd restarting on
running jobs.




Re: [slurm-users] Slurm version 20.11.5 is now available

2021-03-16 Thread Tim Wickberg
One errant backspace snuck into that announcement: the 
job_container.conf man page (with an 'r') serves as the initial 
documentation for this new job_container/tmpfs plugin. The link to the 
HTML version of the man page has been corrected in the text below:


On 3/16/21 4:16 PM, Tim Wickberg wrote:

We are pleased to announce the availability of Slurm version 20.11.5.

This includes a number of moderate severity bug fixes, alongside a new 
job_container/tmpfs plugin developed by NERSC that can be used to create 
per-job filesystem namespaces.


Initial documentation for this plugin is available at:
https://slurm.schedmd.com/job_container.conf.html
Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim





[slurm-users] Slurm version 20.11.5 is now available

2021-03-16 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.11.5.

This includes a number of moderate severity bug fixes, alongside a new 
job_container/tmpfs plugin developed by NERSC that can be used to create 
per-job filesystem namespaces.


Initial documentation for this plugin is available at:
https://slurm.schedmd.com/job_container.conf.html
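
A minimal configuration sketch (the path below is illustrative; the man
page above documents the supported options):

  # slurm.conf
  JobContainerType=job_container/tmpfs
  PrologFlags=Contain

  # job_container.conf
  BasePath=/var/spool/slurm/containers
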
Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.5
==
 -- Fix main scheduler bug where bf_hetjob_prio truncates SchedulerParameters.
 -- Fix sacct not displaying UserCPU, SystemCPU and TotalCPU for large times.
 -- scrontab - fix to return the correct index for a bad #SCRON option.
 -- scrontab - fix memory leak when invalid option found in #SCRON line.
 -- Add errno for when a user requests multiple partitions and they are using
partition based associations.
 -- Fix issue where a job could run in a wrong partition when using
EnforcePartLimits=any and partition based associations.
 -- Remove possible deadlock when adding associations/wckeys in multiple
threads.
 -- When using PrologFlags=alloc make sure the correct Slurm version is set
in the credential.
 -- When sending a job a warning signal make sure we always send SIGCONT
beforehand.
 -- Fix issue where a batch job would continue running if a prolog failed on a
node that wasn't the batch host and requeuing was disabled.
 -- Fix issue where sometimes salloc/srun wouldn't get a message about a prolog
failure in the job's stdout.
 -- Requeue or kill job on a prolog failure when PrologFlags is not set.
 -- Fix race condition causing node reboots to get requeued before
ResumeTimeout expires.
 -- Preserve node boot_req_time on reconfigure.
 -- Preserve node power_save_req_time on reconfigure.
 -- Fix node reboots being queued and issued multiple times and preventing the
reboot to time out.
 -- Fix debug message related to GrpTRESRunMin (AssocGrpCPURunMinutesLimit).
 -- Fix run_command to exit correctly if track_script kills the calling thread.
 -- Only requeue a job when the PrologSlurmctld returns nonzero.
 -- When a job is signaled with SIGKILL make sure we flush all
prologs/setup scripts.
 -- Handle burst buffer scripts if the job is canceled while stage_in is
happening.
 -- When shutting down the slurmctld make note to ignore error message when
we have to kill a prolog/setup script we are tracking.
 -- scrontab - add support for the --open-mode option.
 -- acct_gather_profile/influxdb - avoid segfault on plugin shutdown if setup
has not completed successfully.
 -- Reduce delay in starting salloc allocations when running with prologs.
 -- Fix issue passing open fd's with [send|recv]msg.
 -- Alter AllocNodes check to work if the allocating node's domain doesn't
match the slurmctld's. This restores the pre-20.11 behavior.
 -- Fix slurmctld segfault if jobs from a prior version had the now-removed
INVALID_DEPEND state flag set and were allowed to run in 20.11.
 -- Add job_container/tmpfs plugin to give a method to provide a private /tmp
per job.
 -- Set the correct core affinity when using AutoDetect.
 -- Start relying on the conf again in xcpuinfo_mac_to_abs().
 -- Fix global_last_rollup assignment on job resizing.
 -- slurmrestd - hand over connection context on _on_message_complete().
 -- slurmrestd - mark "environment" as required for job submissions in schema.
 -- slurmrestd - Disable credential reuse on the same TCP connection. Pipelined
HTTP connections will have to provide authentication with every request.
 -- Avoid data conversion error on NULL strings in data_get_string_converted().
 -- Handle situation where slurmctld is too slow processing
REQUEST_COMPLETE_BATCH_SCRIPT and it gets resent from the slurmstepd.




[slurm-users] Slurm version 20.11.4 is now available

2021-02-18 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.11.4.

This includes a workaround for a broken glibc version that erroneously 
prints a long-double value of 0 as "nan", which can corrupt Slurm's 
association state files.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.4
==
 -- Fix node selection for advanced reservations with features.
 -- mpi/pmix: Handle pipe failure better when using ucx.
 -- mpi/pmix: include PMIX_NODEID for each process entry.
 -- Fix job getting rejected after being requeued on same node that died.
 -- job_submit/lua - add "network" field.
 -- Fix situations when a recurring reservation could erroneously skip a
period.
 -- Ensure that a reservation's [pro|epi]log scripts are run on recurring reservations.
 -- Fix threads-per-core memory allocation issue when using CR_CPU_MEMORY.
 -- Fix scheduling issue with --gpus.
 -- Fix gpu allocations that request --cpus-per-task.
 -- mpi/pmix: fixed print messages for all PMIXP_* macros
 -- Add mapping for XCPU to --signal option.
 -- Fix regression in 20.11 that prevented a full pass of the main scheduler
from ever executing.
 -- Work around a glibc bug in which "0" is incorrectly printed as "nan"
which will result in corrupted association state on restart.
 -- Fix regression in 20.11 which made slurmd incorrectly attempt to find the
parent slurmd address when not applicable and send incorrect reverse-tree
info to the slurmstepd.
 -- Fix cgroup ns detection when using containers (e.g. LXC or Docker).
 -- scrontab - change temporary file handling to work with emacs.




[slurm-users] Slurm version 20.11.3 is now available; reverts to older step launch semantics

2021-01-19 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.11.3.

This does include a major functional change to how job step launch is 
handled compared to the previous 20.11 releases. This affects srun as 
well as MPI stacks - such as Open MPI - which may use srun internally as 
part of the process launch.


One of the changes made in the Slurm 20.11 release was to the semantics 
for job steps launched through the 'srun' command. This also 
inadvertently impacts many MPI releases that use srun underneath their 
own mpiexec/mpirun command.


For 20.11.{0,1,2} releases, the default behavior for srun was changed 
such that each step was allocated exactly what was requested by the 
options given to srun, and did not have access to all resources assigned 
to the job on the node by default. This change was equivalent to Slurm 
setting the --exclusive option by default on all job steps. Job steps 
desiring all resources on the node needed to explicitly request them 
through the new '--whole' option.


In the 20.11.3 release, we have reverted to the 20.02 and older behavior 
of assigning all resources on a node to the job step by default.


This reversion is a major behavioral change which we would not generally 
do on a maintenance release, but is being done in the interest of 
restoring compatibility with the large number of existing Open MPI (and 
other MPI flavors) and job scripts that exist in production, and to 
remove what has proven to be a significant hurdle in moving to the new 
release.


Please note that one change to step launch remains - by default, in 
20.11 steps are no longer permitted to overlap on the resources they 
have been assigned. If that behavior is desired, all steps must 
explicitly opt-in through the newly added '--overlap' option.
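
To make the distinction concrete, here is a minimal sbatch sketch (the node 
and task counts and the program names are placeholders):

    #!/bin/bash
    #SBATCH -N1 -n4

    # On 20.11.{0,1,2} a step needed --whole to be given everything the job
    # was allocated on the node; from 20.11.3 onward this is the default
    # again, so the flag below is merely explicit.
    srun --whole ./uses_whole_node

    # Steps no longer overlap on their assigned resources by default in
    # 20.11, so both steps opt in with --overlap to run side by side.
    srun --overlap -n1 ./monitor &
    srun --overlap -n3 ./work
    wait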


Further details and a full explanation of the issue can be found at:
https://bugs.schedmd.com/show_bug.cgi?id=10383#c63

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.3
==
 -- Fix segfault when parsing bad "#SBATCH hetjob" directive.
 -- Allow countless gpu: node GRES specifications in slurm.conf.
 -- PMIx - Don't set UCX_MEM_MMAP_RELOC for older version of UCX (pre 1.5).
 -- Don't green-light any GPU validation when core conversion fails.
 -- Allow updates to a reservation in the database that starts in the future.
 -- Better check/handling of primary key collision in reservation table.
 -- Improve reported error and logging in _build_node_list().
 -- Fix uninitialized variable in _rpc_file_bcast() which could lead to an
incorrect error return from sbcast / srun --bcast.
 -- mpi/cray_shasta - fix use-after-free on error in _multi_prog_parse().
 -- Cray - Handle setting correct prefix for cpuset cgroup with respect to
expected_usage_in_bytes.  This fixes Cray's OOM killer.
 -- mpi/pmix: Fix PMIx_Abort support.
 -- Don't reject jobs allocating more cores than tasks with MaxMemPerCPU.
 -- Fix false error message complaining about oversubscribe in cons_tres.
 -- scrontab - fix parsing of empty lines.
 -- Fix regression causing spank_process_option errors to be ignored.
 -- Avoid making multiple interactive steps.
 -- Fix corner case issues where step creation should fail.
 -- Fix job rejection when --gres is less than --gpus.
 -- Fix regression causing spank prolog/epilog not to be called unless the
spank plugin was loaded in slurmd context.
 -- Fix regression preventing SLURM_HINT=nomultithread from being used
to set defaults for salloc->srun, sbatch->srun sequence.
 -- Reject job credential if non-superuser sets the LAUNCH_NO_ALLOC flag.
 -- Make it so srun --no-allocate works again.
 -- jobacct_gather/linux - Don't count memory on tasks that have already
finished.
 -- Fix 19.05/20.02 batch steps talking with a 20.11 slurmctld.
 -- jobacct_gather/common - Do not process jobacct's with same taskid when
calling prec_extra.
 -- Cleanup all tracked jobacct tasks when extern step child process finishes.
 -- slurmrestd/dbv0.0.36 - Correct structure of dbv0.0.36_tres_list.
 -- Fix regression causing task/affinity and task/cgroup to be out of sync when
configured ThreadsPerCore is different than the physical threads per core.
 -- Fix situation when --gpus is given but not max nodes (-N1-1) in a job
allocation.
 -- Interactive step - ignore cpu bind and mem bind options, and do not set
the associated environment variables which lead to unexpected behavior
from srun commands launched within the interactive step.
 -- Handle exit code from pipe when using UCX with PMIx.




[slurm-users] Upcoming Slurm 20.11.3 release will revert to older step launch semantics

2021-01-08 Thread Tim Wickberg

Hey folks -

As some of you have observed, one of the changes made in the Slurm 20.11 
release was to the semantics for job steps launched through the 'srun' 
command. This also inadvertently impacts many MPI releases that use srun 
underneath their own mpiexec/mpirun command.


For 20.11.{0,1,2} releases, the default behavior for srun was changed to 
limiting the step to only exactly what was requested by the options 
given to srun. This change was equivalent to Slurm setting the 
--exclusive option by default on all job steps. Job steps desiring all 
resources on the node needed to explicitly request them through the new 
'--whole' option.


In the upcoming 20.11.3 release, we will be reverting to the 20.02 and 
older behavior of assigning all resources on a node to the job step by 
default.


This is a major behavioral change, and not one we're making lightly, but 
is being done in the interest of restoring compatibility with the large 
number of existing Open MPI (and other MPI flavors) and job scripts that 
exist in production, and to remove what has proven to be a significant 
hurdle in moving to the new release.


Please note that one change to step launch remains - by default, in 
20.11 steps are no longer permitted to overlap on the resources they 
have been assigned. If that behavior is desired, all steps must 
explicitly opt-in through the newly added '--overlap' option.


Further details and a full explanation of the issue can be found at:
https://bugs.schedmd.com/show_bug.cgi?id=10383#c63

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 20.11.2 is now available

2020-12-18 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.11.2.

This resolves a critical regression from the recent 20.11.1 release 
which prevented both PMI and PMIx interfaces from functioning correctly.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.2
==
 -- Fix older versions of sacct not working with 20.11.
 -- Fix slurmctld crash when using a pre-20.11 srun in a job allocation.
 -- Correct logic problem in _validate_user_access.
 -- Fix libpmi to initialize Slurm configuration correctly.




[slurm-users] Slurm version 20.11.1 is now available

2020-12-10 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.11.1.

This includes a number of fixes made in the month since 20.11 was 
initially released, including critical fixes to nss_slurm and the Perl 
API when used with the newer configless mode of operation.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.1
==
 -- Fix spelling of "overcomited" to "overcomitted" in sreport's cluster
utilization report.
 -- Silence debug message about shutting down backup controllers if none are
configured.
 -- Don't create interactive srun until PrologSlurmctld is done.
 -- Fix fd symlink path resolution.
 -- Fix slurmctld segfault on subnode reservation restore after node
configuration change.
 -- Fix resource allocation response message environment allocation size.
 -- Ensure that details->env_sup is NULL terminated.
 -- select/cray_aries - Correctly remove jobs/steps from blades using NPC.
 -- cons_tres - Avoid max_node_gres when entire node is allocated with
--ntasks-per-gpu.
 -- Allow NULL arg to data_get_type().
 -- In sreport have usage for a reservation contain all jobs that ran in the
reservation instead of just the ones that ran in the time specified. This
way the report for the reservation is not truncated to a time period.
 -- Fix issue with sending wrong batch step id to a < 20.11 slurmd.
 -- Add a job's alloc_node to lua for job modification and completion.
 -- Fix regression getting a slurmdbd connection through the perl API.
 -- Stop the extern step terminate monitor right after proctrack_g_wait().
 -- Fix removing the normalized priority of assocs.
 -- slurmrestd/v0.0.36 - Use correct name for partition field:
"min nodes per job" -> "min_nodes_per_job".
 -- slurmrestd/v0.0.36 - Add node comment field.
 -- Fix regression marking cloud nodes as "unexpectedly rebooted" after
multiple boots.
 -- Fix slurmctld segfault in _slurm_rpc_job_step_create().
 -- slurmrestd/v0.0.36 - Filter node states against NODE_STATE_BASE to avoid
the extended states all being reported as "invalid".
 -- Fix race that can prevent the prolog for a requeued job from running.
 -- cli_filter - add "type" to readily distinguish between the CLI command in
use.
 -- smail - reduce sleep before seff to 5 seconds.
 -- Ensure SPANK prolog and epilog run without an explicit PlugStackConfig.
 -- Disable MySQL automatic reconnection.
 -- Fix allowing "b" after memory unit suffixes.
 -- Fix slurmctld segfault with reservations without licenses.
 -- Due to internal restructuring ahead of the 20.11 release, applications
calling libslurm MUST call slurm_init(NULL) before any API calls.
Otherwise the API call is likely to fail due to libslurm's internal
configuration not being available.
 -- slurm.spec - allow custom paths for PMIx and UCX install locations.
 -- Use rpath if enabled when testing for Mellanox's UCX libraries.
 -- slurmrestd/dbv0.0.36 - Change user query for associations to optional.
 -- slurmrestd/dbv0.0.36 - Change account query for associations to optional.
 -- mpi/pmix - change the error handler error message to be more useful.
 -- Add missing connection in acct_storage_p_{clear_stats, reconfig, shutdown}.
 -- Perl API - fix issue when running in configless mode.
 -- nss_slurm - avoid deadlock when stray sockets are found.
 -- Display correct value for ScronParameters in 'scontrol show config'.




[slurm-users] Slurm SC20 Birds-of-a-Feather presentation online

2020-11-30 Thread Tim Wickberg
The roadmap presentation from the SC20 Birds-of-a-Feather session is 
online now:


https://slurm.schedmd.com/SC20/BoF.pdf

There is also a recording of the BoF including the Q+A session with Tim 
and Danny that will remain available through the SC20 virtual platform 
for the next few months. Please note you will need to register to get 
access to this, but the free 'virtual exhibits' status should be sufficient.


https://www.eventscribe.net/2020/SC20/index.asp?presTarget=1473789

(Click the "Video" link on the page to see the recording.)

SC20 registration page:
https://sc20.supercomputing.org/attend/register/

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 20.11.0 is now available

2020-11-17 Thread Tim Wickberg
After 9 months of development and testing we are pleased to announce the 
availability of Slurm version 20.11.0!


Slurm 20.11 includes a number of new features including:

- Overhaul of the job step management and launch code, alongside 
improved GPU task placement support.


- A new "Interactive Step" mode of operation for salloc.

- A new "scrontab" command that can be used to submit and manage 
periodically repeating jobs (a short example follows this list).


- IPv6 support.

- Changes to the reservation logic, with new options allowing users to 
delete reservations, allowing admins to skip the next occurrence of a 
repeated reservation, and allowing for a job to be submitted and 
eligible to run within multiple reservations.


- Dynamic Future Nodes - automatically associate a dynamically 
provisioned (or "cloud") node against a NodeName definition with 
matching hardware.


- An experimental new RPC queuing mode for slurmctld to reduce thread 
contention on heavily loaded clusters.


- SlurmDBD integration with the Slurm REST API.
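
As a short example of the new scrontab command noted above (the time limit, 
partition name, and script path are placeholders; job options are given on 
"#SCRON" lines ahead of a standard crontab timing line):

    $ scrontab -e

    # run a cleanup script every night at 02:30
    #SCRON --time=00:30:00
    #SCRON --partition=batch
    30 2 * * * /home/alice/bin/nightly_cleanup.sh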

Please see the RELEASE_NOTES distributed alongside the source for 
further details.


Thank you to all customers, partners, and community members who 
contributed to this release.


As with past releases, the documentation available at 
https://slurm.schedmd.com has been updated to the 20.11 release. Past 
versions are available in the archive. This release also marks the end 
of support for the 19.05 release. The 20.02 release will remain 
supported up until the 21.08 release next August, but will not see as 
frequent updates, and bug-fixes will be targeted for the 20.11 
maintenance releases going forward.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm versions 20.02.6 and 19.05.8 are now available (CVE-2020-27745 and CVE-2020-27746)

2020-11-12 Thread Tim Wickberg
Slurm versions 20.11.0rc2, 20.02.6 and 19.05.8 are now available, and 
include a series of recent bug fixes, as well as a fix for two security 
issues.


Note: the 19.05 release series is nearing the end of its support 
lifecycle as we prepare to release 20.11 later this month. The 19.05.8 
download link is under the 'Older Versions' page.


SchedMD customers were informed on October 29th and provided patches on 
request; this process is documented in our security policy [1].


CVE-2020-27745:
A review of Slurm's RPC handling code uncovered a potential buffer 
overflow with one utility function. The only affected use is in Slurm's 
PMIx MPI plugin, and a job would only be vulnerable if --mpi=pmix was 
requested, or the site has set MpiDefault=pmix in slurm.conf.
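
One quick way to check whether a cluster defaults to the affected plugin is 
to inspect the running configuration (the slurm.conf path below is only the 
common default; individual jobs can still opt in with --mpi=pmix):

    $ scontrol show config | grep -i MpiDefault
    $ grep -i '^MpiDefault' /etc/slurm/slurm.conf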


CVE-2020-27746:
Slurm's use of the 'xauth' command to manage X11 magic cookies can lead 
to an inadvertent disclosure of a user's cookie when setting up X11 
forwarding on a node. An attacker monitoring /proc on the node could 
race the setup and steal the magic cookie, which may let them connect to 
that user's X11 session. A job would only be impacted if --x11 was 
requested at submission time. This was reported by Jonas Stare (NSC).


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.11.0rc2
==
 -- MySQL - Remove potential race condition when sending updates to a cluster
and commit_delay used.
 -- Fixed regression in rc1 where sinfo et al would not show a node in a resv
state.
 -- select/linear will now allocate up to the node's RealMemory when configured
with SelectTypeParameters=CR_Memory and --mem=0 specified. Previously no
memory was accounted and no memory limits were implied for the job.
 -- Remove unneeded lock check from running the slurmctld prolog for a job.
 -- Fix duplicate key error on clean starts after slurmctld is killed.
 -- Avoid double free of step_record_t in the slurmctld when node is removed
from config.
 -- Zero out step_record_t's magic when freed.
 -- Fix sacctmgr clearing QosLevel when trailing comma is used.
 -- slurmrestd - fix a fatal() error when connecting over IPv6.
 -- slurmrestd - add API to interface with slurmdbd.
 -- mpi/cray_shasta - fix PMI port parsing for non-contiguous port ranges.
 -- squeue and sinfo -O no longer repeat the last suffix specified.
 -- cons_tres - fix regression regarding gpus with --cpus-per-task.
 -- Avoid non-async-signal-safe function calls in X11 forwarding which can
lead to the extern step terminating unexpectedly.
 -- Don't send job completion email for revoked federation jobs.
 -- Fix device or resource busy errors on cgroup cleanup on older kernels.
 -- Avoid binding to IPv6 wildcard address in slurmd if IPv6 is not explicitly
enabled.
 -- Make ntasks_per_gres work with cpus_per_task.
 -- Various alterations in reference to ntasks_per_tres.
 -- slurmrestd - multiple changes to make Slurm's OpenAPI spec compatible with
https://openapi-generator.tech/.
 -- nss_slurm - avoid loading slurm.conf to avoid issues on configless systems,
or systems with config files loaded on shared storage.
 -- scrontab - add cli_filter hooks.
 -- job_submit/lua - expose a "cron_job" flag to identify jobs submitted
through scrontab.
 -- PMIx - fix potential buffer overflows from use of unpackmem().
CVE-2020-27745.
 -- X11 forwarding - fix potential leak of the magic cookie when sent as an
argument to the xauth command. CVE-2020-27746.



* Changes in Slurm 20.02.6
==
 -- Fix sbcast --fanout option.
 -- Tighten up keyword matching for --dependency.
 -- Fix "squeue -S P" not sorting by partition name.
 -- Fix segfault in slurmctld if group resolution fails during job credential
creation.
 -- sacctmgr - Honor PreserveCaseUser when creating users with load command.
 -- Avoid attempting to schedule jobs on magnetic reservations when they aren't
allowed.
 -- Always make sure we clear the magnetic flag from a job.
 -- In backfill avoid NULL pointer dereference.
 -- Fix Segfault at end of slurmctld if you have a magnetic reservation and
you shutdown the slurmctld.
 -- Silence security warning when Slurm is trying a job for a
magnetic reservation.
 -- Have sacct exit correctly when a user/group id isn't valid.
 -- Remove extra \n from invalid user/group id error message.
 -- Detect when extern steps trigger OOM events and mark extern step correctly.
 -- pam_slurm_adopt - permit root access to the node before reading the config
file, which will give root a chance to fix the config if missing or broken.
 -- Reset DefMemPerCPU, MaxMemPerCPU, and TaskPluginParam (among other minor
flags) on reconfigure.
 -- Fix incorrect memory handling of mail_user when upd

[slurm-users] Slurm release candidate version 20.11.0rc1 available for testing

2020-11-03 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 20.11.0rc1.


Slurm 20.11 includes a number of new features including:

- Overhaul of the job step management and launch code, alongside 
improved GPU task placement support.


- A new "Interactive Step" mode of operation for salloc.

- A new "scrontab" command that can be used to submit and manage 
periodically repeating jobs.


- IPv6 support.

- Changes to the reservation logic, with new options allowing users to 
delete reservations, allowing admins to skip the next occurrence of a 
repeated reservation, and allowing for a job to be submitted and 
eligible to run within multiple reservations.


- Dynamic Future Nodes - automatically associate a dynamically 
provisioned (or "cloud") node against a NodeName definition with 
matching hardware.


- An experimental new RPC queuing mode for slurmctld to reduce thread 
contention on heavily loaded clusters.


Please see the RELEASE_NOTES distributed alongside the source for 
further details.


This is the first release candidate version of the upcoming 20.11 
release series, and represents the end of development for the release 
cycle, and a finalization of the RPC and state file formats.


If any issues are identified with this new release candidate, please 
report them through https://bugs.schedmd.com against the 20.11.x version 
and we will address them before the first production 20.11.0 release is 
made.


Please note that the release candidates are not intended for production 
use. Barring any late-discovered issues, the state file formats should 
not change between now and 20.11.0 and are considered frozen at this 
time for the 20.11 release.


A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ .


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



Re: [slurm-users] Slurm User Group Meeting (SLUG'20) Agenda Posted

2020-09-15 Thread Tim Wickberg
PDFs of the slides from the presentations are posted to the publication 
archive now:


https://slurm.schedmd.com/publications.html

The videos for today's presentations will remain up through next Monday 
if you did not get a chance to tune in live:

https://slurm.schedmd.com/slurm_ug_agenda.html

Thanks to everyone for tuning in and participating, and we hope to see 
you next year at SLUG'21, hopefully in person. We will hopefully have 
details to share about that next spring.


As one last note, the Slurm Community Birds-of-a-Feather session was 
accepted for the virtual SC20 conference. Further details will be posted 
closer to SC20 in November.


- Tim

On 9/10/20 5:16 PM, Tim Wickberg wrote:
The agenda's been updated with links to the presentation broadcasts 
which will go live on Tuesday morning:


https://slurm.schedmd.com/slurm_ug_agenda.html

No registration is required.

Note that each presentation is a separate stream, and you will need to 
switch between them at the break. Please bear with us in the event of 
any technical difficulties; this is our first foray into a livestreamed 
event, and despite our testing, there may be some unexpected hiccups.


We do encourage questions during the presentations (through the YouTube 
Live chat feature), although SchedMD will be filtering those before 
relaying them to the presenter. (And there is a ~5 second broadcast 
delay on top of that.)


The presentation videos will remain up for at least one week after 
virtual SLUG'20 concludes.


- Tim




Re: [slurm-users] Slurm User Group Meeting (SLUG'20) Agenda Posted

2020-09-15 Thread Tim Wickberg
One last reminder that SLUG'20 will be starting in a half hour. Looking 
forward to (virtually) seeing you there!


cheers,
- Tim

On 9/10/20 5:16 PM, Tim Wickberg wrote:
The agenda's been updated with links to the presentation broadcasts 
which will go live on Tuesday morning:


https://slurm.schedmd.com/slurm_ug_agenda.html

No registration is required.

Note that each presentation is a separate stream, and you will need to 
switch between them at the break. Please bear with us in the event of 
any technical difficulties; this is our first foray into a livestreamed 
event, and despite our testing, there may be some unexpected hiccups.


We do encourage questions during the presentations (through the YouTube 
Live chat feature), although SchedMD will be filtering those before 
relaying them to the presenter. (And there is a ~5 second broadcast 
delay on top of that.)


The presentation videos will remain up for at least one week after 
virtual SLUG'20 concludes.


- Tim




Re: [slurm-users] Slurm User Group Meeting (SLUG'20) Agenda Posted

2020-09-10 Thread Tim Wickberg
The agenda's been updated with links to the presentation broadcasts 
which will go live on Tuesday morning:


https://slurm.schedmd.com/slurm_ug_agenda.html

No registration is required.

Note that each presentation is a separate stream, and you will need to 
switch between them at the break. Please bear with us in the event of 
any technical difficulties; this is our first foray into a livestreamed 
event, and despite our testing, there may be some unexpected hiccups.


We do encourage questions during the presentations (through the YouTube 
Live chat feature), although SchedMD will be filtering those before 
relaying them to the presenter. (And there is a ~5 second broadcast 
delay on top of that.)


The presentation videos will remain up for at least one week after 
virtual SLUG'20 concludes.


- Tim



[slurm-users] Slurm version 20.02.5 is now available

2020-09-10 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.02.5.

This includes an extended set of fixes of varying severity since the 
last maintenance release was made a month ago.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.02.5
==
 -- Fix leak of TRESRunMins when job time is changed with --time-min
 -- pam_slurm - explicitly initialize slurm config to support configless mode.
 -- scontrol - Fix exit code when creating/updating reservations with wrong
Flags.
 -- When a GRES has a no_consume flag, report 0 for allocated.
 -- Fix cgroup cleanup by jobacct_gather/cgroup.
 -- When creating reservations/jobs don't allow counts on a feature unless
using an XOR.
 -- Improve number of boards discovery
 -- Fix updating a reservation NodeCnt on a zero-count reservation.
 -- slurmrestd - provide an explicit error messages when PSK auth fails.
 -- cons_tres - fix job requesting single gres per-node getting two or more
nodes with less CPUs than requested per-task.
 -- cons_tres - fix calculation of cores when using gres and cpus-per-task.
 -- cons_tres - fix job not getting access to socket without GPU or with less
than --gpus-per-socket when not enough cpus available on required socket
and not using --gres-flags=enforce binding.
 -- Fix HDF5 type version build error.
 -- Fix creation of CoreCnt only reservations when the first node isn't
available.
 -- Fix wrong DBD Agent queue size in sdiag when using accounting_storage/none.
 -- Improve job constraints XOR option logic.
 -- Fix preemption of hetjobs when needed nodes not in leader component.
 -- Fix wrong bit_or() messing potential preemptor jobs node bitmap, causing
bad node deallocations and even allocation of nodes from other partitions.
 -- Fix double-deallocation of preempted non-leader hetjob components.
 -- slurmdbd - prevent truncation of the step nodelists over 4095.
 -- Fix nodes remaining in drain state after rebooting with ASAP option.




Re: [slurm-users] Slurm User Group Meeting (SLUG'20) Agenda Posted

2020-09-06 Thread Tim Wickberg

Nope, no registration required.

On 9/3/20 10:05 PM, Jacqueline Scoggins wrote:

Do we have to register to join?

Jackie

On Mon, Aug 31, 2020 at 3:54 PM Tim Wickberg wrote:


We're still nailing down a few details with the streaming platform (and
will add them to the website when resolved), but do expect to have the
video available for one or two weeks afterwards.

- Tim

On 8/31/20 7:07 AM, Ole Holm Nielsen wrote:
 > On 8/28/20 10:45 PM, Tim Wickberg wrote:
 >> The Slurm User Group Meeting (SLUG'20) this fall will be moving
 >> online. In lieu of an in-person meeting, SchedMD will broadcast a
 >> select set of presentations on Tuesday, September 15th, 2020,
from 9am
 >> to noon (MDT).
 >>
 >> The agenda is now posted online at:
 >> https://slurm.schedmd.com/slurm_ug_agenda.html
 >>
 >> Links to the broadcasts will be added there when available, and an
 >> update will be sent to slurm-announce and slurm-users lists.
 >
 > The broadcast timing is a bit awkward for European customers due
to the
 > 8 hour time difference.  I will most likely need to view the
 > presentations later on.  Can the broadcasts be made available for
 > viewing later on?
 >
 > Thanks,
 > Ole
 >





Re: [slurm-users] Slurm User Group Meeting (SLUG'20) Agenda Posted

2020-08-31 Thread Tim Wickberg
We're still nailing down a few details with the streaming platform (and 
will add them to the website when resolved), but do expect to have the 
video available for one or two weeks afterwards.


- Tim

On 8/31/20 7:07 AM, Ole Holm Nielsen wrote:

On 8/28/20 10:45 PM, Tim Wickberg wrote:
The Slurm User Group Meeting (SLUG'20) this fall will be moving 
online. In lieu of an in-person meeting, SchedMD will broadcast a 
select set of presentations on Tuesday, September 15th, 2020, from 9am 
to noon (MDT).


The agenda is now posted online at:
https://slurm.schedmd.com/slurm_ug_agenda.html

Links to the broadcasts will be added there when available, and an 
update will be sent to slurm-announce and slurm-users lists.


The broadcast timing is a bit awkward for European customers due to the 
8 hour time difference.  I will most likely need to view the 
presentations later on.  Can the broadcasts be made available for 
viewing later on?


Thanks,
Ole





[slurm-users] Slurm User Group Meeting (SLUG'20) Agenda Posted

2020-08-28 Thread Tim Wickberg
The Slurm User Group Meeting (SLUG'20) this fall will be moving online. 
In lieu of an in-person meeting, SchedMD will broadcast a select set of 
presentations on Tuesday, September 15th, 2020, from 9am to noon (MDT).


The agenda is now posted online at:
https://slurm.schedmd.com/slurm_ug_agenda.html

Links to the broadcasts will be added there when available, and an 
update will be sent to slurm-announce and slurm-users lists.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 20.02.4 is now available

2020-08-05 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.02.4.

This includes an extended set of fixes of varying severity since the 
last maintenance release was made more than two months ago.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.02.4
==
 -- srun - suppress job step creation warning message when waiting on
PrologSlurmctld.
 -- slurmrestd - fix incorrect return values in data_list_for_each() functions.
 -- mpi/pmix - fix issue where HetJobs could fail to launch.
 -- slurmrestd - set content-type header in responses.
 -- Fix cons_res GRES overallocation for --gres-flags=disable-binding.
 -- Fix cons_res incorrectly filtering cores with respect to GRES locality for
--gres-flags=disable-binding requests.
 -- Fix regression where a dependency on multiple jobs in a single array using
underscores would only add the first job.
 -- slurmrestd - fix corrupted output due to incorrect use of memcpy().
 -- slurmrestd - address a number of minor Coverity warnings.
 -- Handle retry failure when slurmstepd is communicating with srun correctly.
 -- Fix jobacct_gather possibly duplicate stats when _is_a_lwp error shows up.
 -- Fix tasks binding to GRES which are closest to the allocated CPUs.
 -- Fix AMD GPU ROCM 3.5 support.
 -- Fix handling of job arrays in sacct when querying specific steps.
 -- slurmrestd - avoid fallback to local socket authentication if JWT
authentication is ill-formed.
 -- slurmrestd - restrict ability of requests to use different authentication
plugins.
 -- slurmrestd - unlink named unix sockets before closing.
 -- slurmrestd - fix invalid formatting in openapi.json.
 -- Fix batch jobs stuck in CF state on FrontEnd mode.
 -- Add a separate explicit error message when rejecting changes to active node
features.
 -- cons_common/job_test - fix slurmctld SIGABRT due to double-free.
 -- Fix updating reservations to set the duration correctly if updating the
start time.
 -- Fix update reservation to promiscuous mode.
 -- Fix override of job tasks count to max when ntasks-per-node present.
 -- Fix min CPUs per node not being at least CPUs per task requested.
 -- Fix CPUs allocated to match CPUs requested when requesting GRES and
threads per core equal to one.
 -- Fix NodeName config parsing with Boards and without CPUs.
 -- Ensure SLURM_JOB_USER and SLURM_JOB_UID are set in SrunProlog/Epilog.
 -- Fix error messages for certain invalid salloc/sbatch/srun options.
 -- pmi2 - clean up sockets at step termination.
 -- Fix 'scontrol hold' to work with 'JobName'.
 -- sbatch - handle --uid/--gid in #SBATCH directives properly.
 -- Fix race condition in job termination on slurmd.
 -- Print specific error messages if trying to use certain
priority/multifactor factors that cannot work without SlurmDBD.
 -- Avoid partial GRES allocation when --gpus-per-job is not satisfied.
 -- Cray - Avoid referencing a variable outside of its correct scope when
dealing with creating steps within a het job.
 -- slurmrestd - correctly handle larger addresses from accept().
 -- Avoid freeing wrong pointer with SlurmctldParameters=max_dbd_msg_action
with another option after that.
 -- Restore MCS label when suspended job is resumed.
 -- Fix insufficient lock levels.
 -- slurmrestd - use errno from job submission.
 -- Fix "user" filter for sacctmgr show transactions.
 -- Fix preemption logic.
 -- Fix no_consume GRES for exclusive (whole node) requests.
 -- Fix regression in 20.02 that caused an infinite loop in slurmctld when
requesting --distribution=plane for the job.
 -- Fix parsing of the --distribution option.
 -- Add CONF READ_LOCK to _handle_fed_send_job_sync.
 -- prep/script - always call slurmctld PrEp callback in _run_script().
 -- Fix node estimation for jobs that use GPUs or --cpus-per-task.
 -- Fix jobcomp, job_submit and cli_filter Lua implementation plugins causing
slurmctld and/or job submission CLI tools segfaults due to bad return
handling when the respective Lua script failed to load.
 -- Fix propagation of gpu options through hetjob components.
 -- Add SLURM_CLUSTERS environment variable to scancel.
 -- Fix packing/unpacking of "unlinked" jobs.
 -- Connect slurmstepd's stderr to srun for steps launched with --pty.
 -- Handle MPS correctly when doing exclusive allocations.
 -- slurmrestd - fix compiling against libhttpparser in a non-default path.
 -- slurmrestd - avoid compilation issues with libhttpparser < 2.6.
 -- Fix compile issues when compiling slurmrestd without --enable-debug.
 -- Reset idle time on a reservation that is getting purged.
 -- Fix reoccurring reservations that have Purge_comp= to keep correct
duration if they are purged.
 -- scontrol - changed the "PROMISCUOUS" flag to "MAGNETIC"
 -- Early retur

[slurm-users] The Slurm User Group Meeting (SLUG'20) goes virtual this fall

2020-06-15 Thread Tim Wickberg
The Slurm User Group Meeting (SLUG'20) this fall will be moving online, 
and will not be hosted in-person at Harvard as scheduled, due to 
uncertainty around scheduling events caused by COVID-19.[1]


In lieu of an in-person meeting, SchedMD will broadcast a select set of 
presentations on Tuesday, September 15th, 2020. The agenda will be 
announced later this summer, but we ask you to hold the date for this 
virtual SLUG meeting.


Due to this change in format and compressed schedule, we will not be 
issuing a call for presentations this year. We expect to resume a normal 
program for SLUG'21.


- Tim

[1] For those who previously registered, all registrations were 
cancelled and refunded as of this afternoon. Please check your inbox for 
receipts from Eventbrite.


--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm versions 20.02.3 and 19.05.7 are now available (CVE-2020-12693)

2020-05-21 Thread Tim Wickberg
Slurm versions 20.02.3 and 19.05.7 are now available, and include a 
series of recent bug fixes, as well as a fix for a security issue with 
the optional message aggregation feature.


SchedMD customers were informed on May 7th and provided a patch on 
request; this process is documented in our security policy [1].


CVE-2020-12693:

A review of what was intended to be a minor cleanup patch uncovered an 
underlying race condition for systems with Message Aggregation enabled. 
This race condition could allow a user to launch a process as an 
arbitrary user.


This is only an issue for systems with Message Aggregation enabled, 
which we expect to be a small number of Slurm installations in practice.


Message Aggregation is off in Slurm by default, and is only enabled by 
MsgAggregationParams=WindowMsgs=<number>, where <number> is greater than 1. 
(Using Message Aggregation on your systems is not a recommended 
configuration at this time, and we may retire this subsystem in a future 
Slurm release in favor of other RPC aggregation techniques. Although 
care must be taken before disabling this to avoid communication issues.)
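
To confirm whether a cluster has this feature enabled, the parameter can be 
checked directly (the slurm.conf path below is only the common default; any 
WindowMsgs value greater than 1 means the affected code path is active):

    $ grep -i 'MsgAggregationParams' /etc/slurm/slurm.conf
    $ scontrol show config | grep -i MsgAggregationParams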


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.02.3
==
 -- Factor in ntasks-per-core=1 with cons_tres.
 -- Fix formatting in error message in cons_tres.
 -- Fix calling stat on a NULL variable.
 -- Fix minor memory leak when using reservations with flags=first_cores.
 -- Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
 -- Fix --mem-per-gpu for heterogenous --gres requests.
 -- Fix slurmctld load order in load_all_part_state().
 -- Fix race condition not finding jobacct gather task cgroup entry.
 -- Suppress error message when selecting nodes on disjoint topologies.
 -- Improve performance of _pack_default_job_details() with large number of job
arguments.
 -- Fix archive loading previous to 17.11 jobs per-node req_mem.
 -- Fix regression validating that --gpus-per-socket requires --sockets-per-node
for steps. Should only validate allocation requests.
 -- error() instead of fatal() when parsing an invalid hostlist.
 -- nss_slurm - fix potential deadlock in slurmstepd on overloaded systems.
 -- cons_tres - fix --gres-flags=enforce-binding and related --cpus-per-gres.
 -- cons_tres - Allocate lowest numbered cores when filtering cores with gres.
 -- Fix getting system counts for named GRES/TRES.
 -- MySQL - Fix for handing typed GRES for association rollups.
 -- Fix step allocations when tasks_per_core > 1.
 -- Fix allocating more GRES than requested when asking for multiple GRES types.



* Changes in Slurm 19.05.7
==
 -- Fix handling of -m/--distribution options for across socket/2nd level by
task/affinity plugin.
 -- Fix grp_node_bitmap error when slurmctld started before slurmdbd.
 -- Fix compilation issues in GCC10.
 -- Fix distributing job steps across idle nodes within a job.
 -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
request resulting in slurmctld hang.
 -- priority/multifactor - gracefully handle NULL list of associations or array
of siblings when calculating FairTree fairshare.
 -- Fix cons_tres --exclusive=user to allocate only requested number of CPUs.
 -- Add MySQL deadlock detection and automatic retry mechanism.
 -- Fix _verify_node_state memory requested as --mem-per-gpu DefMemPerGPU.
 -- Factor in ntasks-per-core=1 with cons_tres.
 -- Fix formatting in error message in cons_tres.
 -- Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
 -- Fix --mem-per-gpu for heterogenous --gres requests.
 -- Fix slurmctld load order in load_all_part_state().
 -- Fix getting system counts for named GRES/TRES.
 -- MySQL - Fix for handing typed GRES for association rollups.
 -- Fix step allocations when tasks_per_core > 1.




[slurm-users] Slurm version 20.02.2 is now available

2020-04-30 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 20.02.2.

This includes a series of moderate and minor fixes since the last 
maintenance releases for both branches.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.02.2
==
 -- Fix slurmctld segfault when checking no_consume GRES node allocation counts.
 -- Fix resetting of cloud_dns on a reconfigure.
 -- squeue - change output for dependency column to use "(null)" instead of ""
for no dependencies as documented in the man page, and used by other columns.
 -- Clear node_cnt_wag after job update.
 -- Fix regression where AccountingStoreJobComment was not defaulting to 'yes'.
 -- Send registration message immediately after a node is resumed.
 -- Cray - Fix hetjobs when using only a single component in the step launch.
 -- Cray - Fix hetjobs launched without component 0.
 -- Cray - Quiet cookies missing message which is expected on for hetjobs.
 -- Fix handling of -m/--distribution options for across socket/2nd level by
task/affinity plugin.
 -- Fix grp_node_bitmap error when slurmctld started before slurmdbd.
 -- Fix scheduling issue when there are not enough nodes available to run a job
resulting in possible job starvation.
 -- Make it so mpi/cray_shasta appears in srun --mpi=list
 -- Don't requeue jobs that have been explicitly canceled.
 -- Fix error message for a regular user trying to update licenses on a running
job.
 -- Fix backup slurmctld handling for logrotation via SIGUSR2.
 -- Fix reservation feature specification when looking for inactive features
after active features fails.
 -- Prevent misleading error messages for reservation creation.
 -- Print message in scontrol when a request fails for not having enough nodes.
 -- Fix duplicate output in sacct with multiple resv events.
 -- auth/jwt - return correct gid for a given user. This was incorrectly
assuming the user's primary group name matched their username.
 -- slurmrestd - permit non-SlurmUser/root job submission.
 -- Use host IP if hostname unknown for job submission for allocating node.
 -- Fix issue with primary_slurmdbd_resumed_operation trigger not happening
on slurmctld restart.
 -- Fix race in acct_gather_interconnect/ofed on step termination.
 -- Fix typo of SlurmctldProlog -> PrologSlurmctld in error message.
 -- slurm.spec - add SuSE-specific dependencies for optional slurmrestd package.
 -- Fix FreeBSD build issues.
 -- Fixed sbatch not processing --ignore-pbs in batch script.
 -- Don't clear the qos_id of an invalid QOS.
 -- Allow a job that was once FAIL_[QOS|ACCOUNT] to be eligible again if
the qos|account limitation is remedied.
 -- Fix core reservations using the FLEX flag to allow use of resources
outside of the reservation allocation.
 -- Fix MPS without File with 1 GPU, and without GPUs.
 -- Add FreeBSD support to proctrack/pgid plugin.
 -- Fix remote dependency testing for meta job in job array.
 -- Fix preemption when dealing with a job array.
 -- Don't send remote non-pending singleton dependencies on federation update.
 -- slurmrestd - fix crash on empty query.
 -- Fix race condition which could lead to invalid references in backfill.
 -- Fix edge case in _remove_job_hash().
 -- Fix exit code when using --cluster/-M client options.
 -- Fix compilation issues in GCC10.
 -- Fix invalid references when federated job is revoked while in backfill loop.
 -- Fix distributing job steps across idle nodes within a job.
 -- Fix detected floating reservation overlapping.
 -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
request resulting in slurmctld hang.
 -- Send the current (not the previous) reason for a pending job to client
commands like squeue/scontrol.
 -- Fix incorrect lock levels for select_g_reconfigure().
 -- Handle hidden nodes correctly in slurmrestd.
 -- Allow sacctmgr to use MaxSubmitP[U|A] as format options.
 -- Fix segfault when trying to delete a corrupted association.
 -- Fix setting ntasks-per-core when using --multithread.
 -- Only override job wait reason to priority if Reason=None or
Reason=Resources.
 -- Perl API / seff - fix missing symbol issue with accounting_storage/slurmdbd.
 -- slurm.spec - add --with cray_shasta option.
 -- Downgrade "Node config differ.." error message if config_overrides enabled.
 -- Add client error when using --gpus-per-socket without --sockets-per-node.
 -- Fix nvml/rsmi debug statements making it to stderr.
 -- NodeSets - fix slurmctld segfault in newer glibc if any nodes have no
defined features.
 -- ConfigLess - write out plugstack config to correct config file name in
the config cache.
 -- priority/multifactor - gracefully handle NULL list of associations or array
of siblings when calculating FairTree fairshare.
 -- Fix cons_

[slurm-users] Slurm versions 20.02.1 and 19.05.6 are now available

2020-03-26 Thread Tim Wickberg
We are pleased to announce the availability of Slurm versions 20.02.1 
and 19.05.6.


This includes a series of minor fixes since the last maintenance 
releases for both branches.


Please note that the 19.05.6 release is expected to be the last 
maintenance release of that branch (barring any critical security 
issues) as our support team has shifted their attention to the 20.02 
release. Also note that support for the 18.08 release ended in 
February; SchedMD customers are encouraged to upgrade to a supported 
major release (20.02 or 19.05) at their earliest convenience.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 20.02.1
==
 -- Improve job state reason for jobs hitting partition_job_depth.
 -- Speed up testing of singleton dependencies.
 -- Fix negative loop bound in cons_tres.
 -- srun - capture the MPI plugin return code from mpi_hook_client_fini() and
use as final return code for step failure.
 -- Fix segfault in cli_filter/lua.
 -- Fix --gpu-bind=map_gpu reusability if tasks > elements.
 -- Make sure config_flags on a gres are sent to the slurmctld on node
registration.
 -- Prolog/Epilog - Fix missing GPU information.
 -- Fix segfault when using config parser for expanded lines.
 -- Fix bit overlap test function.
 -- Don't accrue time if job begin time is in the future.
 -- Remove accrue time when updating a job start/eligible time to the future.
 -- Fix regression in 20.02.0 that broke --depend=expand.
 -- Reset begin time on job release if it's not in the future.
 -- Fix for recovering burst buffers when using high-availability.
 -- Fix invalid read due to freeing an incorrectly allocated env array.
 -- Update slurmctld -i message to warn about losing data.
 -- Fix scontrol cancel_reboot so it clears the DRAIN flag and node reason for a
pending ASAP reboot.



* Changes in Slurm 19.05.6
==
 -- Fix OverMemoryKill.
 -- Fix memory leak in scontrol show config.
 -- Remove PART_NODES reservation flag after ignoring it at creation.
 -- Fix deprecation of MemLimitEnforce parameter.
 -- X11 forwarding - alter Xauthority regex to work when "FamilyWild" cookies
    are present in the "xauth list" output.
 -- Fix memory leak when utilizing core reservations.
 -- Fix issue where adding WCKeys and then using them right away didn't always
work.
 -- Add cosmetic batch step to correct component in a hetjob.
 -- Fix to make scontrol write config create a usable config without editing.
 -- Fix memory leak when pinging backup controller.
 -- Fix issue with 'scontrol update' not enforcing all QoS / Association limits.
 -- Fix to properly schedule certain jobs with cons_tres plugin.
 -- Fix FIRST_CORES for reservations when using cons_tres.
 -- Fix sbcast -C argument parsing.
 -- Replace/deprecate max_job_bf with bf_max_job_test and print error message.
 -- sched/backfill - fix options parsing when bf_hetjob_prio enabled.
 -- Fix for --gpu-bind when no gpus requested.
 -- Fix sshare -l crash with large values.
 -- Fix printing NULL job and step pointers.
 -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
request resulting in slurmctld hang.
 -- Improve handling of --gpus-per-task to make sure appropriate number of GPUs
is assigned to job.




[slurm-users] Slurm version 20.02.0 is now available

2020-02-25 Thread Tim Wickberg
After 9 months of development and testing we are pleased to announce the 
availability of Slurm version 20.02.0!


Downloads are available from https://www.schedmd.com/downloads.php.

Highlights of the 20.02 release include:

- A "configless" method of deploying Slurm within the cluster, in which 
the slurmd and user commands can use DNS SRV records to locate the 
slurmctld host and automatically download the relevant configuration files 
(a brief configuration sketch follows this list).


- A new "auth/jwt" authentication mechanism using JWT, which can help 
integrate untrusted external systems into the cluster.


- A new "slurmrestd" command/daemon which translates a new Slurm REST 
API into the underlying libslurm calls.


- Packaging fixes for RHEL8 distributions.

- Significant performance improvements to the backfill scheduler, as 
well as to string construction and processing.
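
As a brief sketch of the configless deployment mentioned in the first item 
above (the hostname, port, and zone-file syntax are illustrative; see the 
configless documentation for the authoritative details):

    # slurm.conf on the slurmctld host - allow clients to fetch configuration
    SlurmctldParameters=enable_configless

    # DNS zone file - an SRV record that lets slurmd and the user commands
    # locate the controller and pull their configuration from it
    _slurmctld._tcp  3600  IN  SRV  10 0 6817  ctld.cluster.example.com.

    # alternatively, point slurmd at the controller explicitly
    slurmd --conf-server ctld.cluster.example.com:6817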


Thank you to all customers, partners, and community members who 
contributed to this release.


As with past releases, the documentation available at 
https://slurm.schedmd.com has been updated to the 20.02 release. Past 
versions are available in the archive. This release also marks the end 
of support for the 18.08 release. The 19.05 release will remain 
supported up until the 20.11 release in November, but will not see as 
frequent updates, and bug-fixes will be targeted for the 20.02 
maintenance releases going forward.


--
Tim Wickberg
Chief Technology Officer, SchedMD
Commercial Slurm Development and Support



[slurm-users] Slurm version 20.02.0rc1 is now available

2020-02-12 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
20.02.0rc1.


This is the first release candidate for the upcoming 20.02 release 
series, and marks the finalization of the RPC and state file formats.


This rc1 also includes the first version of the Slurm REST API, as 
implemented in the new slurmrestd command / daemon. The slurmrestd 
command acts as a REST proxy to the libslurm internal API, and can be 
used alongside the new auth/jwt authentication mechanism to integrate 
Slurm into external systems.
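
A rough sketch of how these pieces fit together follows; the listen address, 
endpoint version, and token handling are illustrative only, and assume 
AuthAltTypes=auth/jwt has been configured so 'scontrol token' can issue JWTs:

    # start the REST daemon listening on a TCP port (a unix socket also works)
    $ slurmrestd 0.0.0.0:6820

    # obtain a JWT for the current user and export it (output is SLURM_JWT=...)
    $ export $(scontrol token)

    # call the REST API, passing the user name and token as headers
    $ curl -H "X-SLURM-USER-NAME: $USER" \
           -H "X-SLURM-USER-TOKEN: $SLURM_JWT" \
           http://localhost:6820/slurm/v0.0.35/ping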


A high-level overview of some of the new features and other changes in 
the 20.02 release was presented at SLUG'19, and is archived here:

https://slurm.schedmd.com/publications.html

The Release Notes also include a summary of the major changes:
https://slurm.schedmd.com/archive/slurm-master/news.html

If any issues are identified with this new release candidate, please 
report them through https://bugs.schedmd.com against the 20.02.x version 
and we will address them before the first production 20.02.0 release is 
made.


A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ . Once 20.02 is 
released, the main documentation page at https://slurm.schedmd.com will 
be switched over to this newer content.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 20.02.0pre1 is now available

2020-02-03 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release preview 
version 20.02.0pre1.


This is the first preview of the upcoming 20.02 release series, and 
represents the end of development for the release cycle. The first 
release candidate - 20.02.0rc1 - is expected out next week, and will 
mark the finalization of the RPC and state file formats.


A high-level overview of some of the new features and other changes in 
the 20.02 release was presented at SLUG'19, and is archived here:

https://slurm.schedmd.com/publications.html

The Release Notes also include a summary of the major changes:
https://slurm.schedmd.com/archive/slurm-master/news.html

If any issues are identified with this new release candidate, please 
report them through https://bugs.schedmd.com against the 20.02.x version 
and we will address them before the first production 20.02.0 release is 
made.


A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ . Once 20.02 is 
released, the main documentation page at https://slurm.schedmd.com will 
be switched over to this newer content.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm versions 19.05.5 and 18.08.9 are now available (CVE-2019-19727 and CVE-2019-19728)

2019-12-20 Thread Tim Wickberg
Slurm versions 19.05.5 and 18.08.9 are now available, and include a 
series of recent bug fixes, as well as a fix for two moderate security 
vulnerabilities discussed below.


SchedMD customers were informed on December 11th and provided a patch on 
request; this process is documented in our security policy [1].


CVE-2019-19727:
Johannes Segitz from SUSE reported that slurmdbd.conf may be installed 
with insecure permissions by certain Slurm packaging systems.


Slurm itself - as shipped by SchedMD - does not manage slurmdbd.conf 
directly, but the slurmdbd.conf.example sets a poor example by 
installing itself with 0644 permissions instead of 0600 in both the 
slurm.spec and slurm.spec-legacy packaging scripts.


Sites are encouraged to verify that the slurmdbd.conf file - which 
usually will contain your MySQL user and password - is secure on their 
clusters. Note that this configuration file is only needed by the 
slurmdbd primary (and optional backup) servers, and does not need to be 
accessible throughout the cluster.
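
As a small, hedged illustration of that check (not from the advisory itself):
the sketch below verifies that slurmdbd.conf is not group- or world-accessible
and tightens it to 0600 if needed. The path is a common default and is an
assumption; adjust it for your installation.

    # Sketch: verify slurmdbd.conf is not group- or world-accessible, and fix it if it is.
    import os
    import stat

    conf = "/etc/slurm/slurmdbd.conf"  # assumed default location; adjust for your site

    mode = stat.S_IMODE(os.stat(conf).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        print(f"{conf} has mode {oct(mode)}; tightening to 0600")
        os.chmod(conf, 0o600)  # the file should be owned by the account running slurmdbd
    else:
        print(f"{conf} mode {oct(mode)} looks fine")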


CVE-2019-19728:

Harald Barth from the KTH Royal Institute of Technology reported that 
"srun --uid" may not always drop into the correct user account, and 
instead will print a warning message but launch the tasks as root.


Note that "srun --uid" is only available to the root user, and that this 
issue is only shown by a race condition between successive lookup calls 
within the srun client command. SchedMD does not recommend use of the 
"srun --uid" option (e.g., it does not load the target user's 
environment but will export the root users) and may remove this option 
in a future release.
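
For background on why the corresponding fix aborts the launch rather than
continuing: a launcher that switches from root to a target account
conventionally drops supplementary groups, then the gid, then the uid, and
treats any failure as fatal. The sketch below is a generic illustration of
that pattern only - it is not Slurm's implementation - and the helper name is
hypothetical.

    # Generic sketch of dropping root privileges before launching a user's task;
    # any failure must abort the launch instead of falling through as root.
    # (Illustration only - not Slurm's code. Must itself be run as root.)
    import os
    import pwd
    import sys

    def drop_privileges(username):
        pw = pwd.getpwnam(username)          # raises KeyError for an unknown user
        os.initgroups(username, pw.pw_gid)   # supplementary groups first
        os.setgid(pw.pw_gid)                 # then the primary group
        os.setuid(pw.pw_uid)                 # finally the uid; order matters
        if os.getuid() != pw.pw_uid or os.geteuid() != pw.pw_uid:
            raise RuntimeError("uid did not change as expected")

    if __name__ == "__main__":
        user, argv = sys.argv[1], sys.argv[2:]
        try:
            drop_privileges(user)
        except Exception as exc:
            sys.exit(f"refusing to launch as root: {exc}")  # abort rather than continue
        os.execvp(argv[0], argv)             # launch only after the drop succeeded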


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 19.05.5
==
 -- Fix both socket-[un]constrained GRES issues that would lead to incorrect
GRES allocations and GRES underflow errors at deallocation time.
 -- Reject unrunnable jobs submitted to reservations.
 -- Fix misleading error returned for immediate allocation requests when defer
is set in SchedulerParameters, by decoupling the defer logic from the
too-fragmented logic.
 -- Fix printf format string error on FreeBSD.
 -- Fix parsing of delay_boot in controller when additional arguments follow it.
 -- Fix --ntasks-per-node in cons_tres.
 -- Fix array tasks getting same reject reason.
 -- Ignore DOWN/DRAIN partitions in reduce_completing_frag logic.
 -- Fix alloc_node validation when updating a job.
 -- Fix for requesting specific nodes when using cons_tres topology.
 -- Ensure x11 is setup before launching a job step.
 -- Fix incorrect SLURM_CLUSTER_NAME env var in batch step.
 -- Perl API - Fix undefined symbol for slurmdbd_pack_fini_msg.
 -- Install slurmdbd.conf.example with 0600 permissions to encourage secure
use. CVE-2019-19727.
 -- srun - do not continue with job launch if --uid fails. CVE-2019-19728.



* Changes in Slurm 18.08.9
==
 -- Wrap END_TIMER{,2,3} macro definition in "do {} while (0)" block.
 -- Make sview work with glib2 v2.62.
 -- Make Slurm compile on linux after sys/sysctl.h was deprecated.
 -- Install slurmdbd.conf.example with 0600 permissions to encourage secure
use. CVE-2019-19727.
 -- srun - do not continue with job launch if --uid fails. CVE-2019-19728




[slurm-users] Slurm version 19.05.4 is now available, SC19

2019-11-14 Thread Tim Wickberg
Slurm version 19.05.4 is now available, and includes a series of fixes 
since 19.05.3 was released last month.


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

For those of you who will be at SC19 in Denver: we hope to see you at 
the Slurm booth (#1571), and at the Slurm "Birds of a Feather" session 
on Thursday, November 21st, from 12:15 - 1:15pm, in rooms 
401/402/403/404. As always, there will be a number of presentations at 
the Slurm booth - please check the display in the booth for the schedule.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 19.05.4
==
 -- Don't allow empty string as a reservation name; generate a name if empty
string is provided.
 -- Fix salloc segfault when using --no-shell option.
 -- Fix divide by zero when normalizing partition priorities.
 -- Restore ability to set JobPriorityFactor to 0 on a partition.
 -- Fix multi-partition non-normalized job priorities.
 -- Adjust precedence between --mem-per-cpu and --mem-per-node to enforce
them as mutually exclusive. Specifying either on the command line will
now explicitly override any value inherited through the environment.
 -- Always print node's version, if it exists, in scontrol show nodes.
 -- sbatch - ensure SLURM_NTASKS_PER_NODE is exported when --ntasks-per-node
is set.
 -- slurmctld - fix memory leak when using DebugFlags=Reservation.
 -- Reset --mem and --mem-per-cpu options correctly when using --mem-per-gpu.
 -- Use correct function signature for step_set_env() in gres plugin interface.
 -- Restore pre-19.05 hostname handling behavior for AllocNodes by always
truncating to just the host portion and dropping any domain name portion
returned by gethostbyaddr().
 -- Fix abort initializing a configuration without acct_gather.conf.
 -- Fix GRES binding and CLOUD nodes GRES setup regressions.
 -- Make sview work with glib2 v2.62.
 -- Fix slurmctld abort when in developer mode and submitting to multiple
partitions with a bad QOS and not enforcing QOS.
 -- Enforce PART_NODES if only PartitionName is specified.
 -- Fix slurmd -G functionality.
 -- Fix build on 32-bit systems.
 -- Remove duplicate log entry on update job.
 -- sched/backfill - fix the estimated sched_nodes for multi-part jobs.
 -- slurm.spec - fix pmix_version global context macro.
 -- Fix cons_tres topology logic incorrectly evaluating insufficient resources.
 -- Fix job "--switches=count@time" option handling in cons_tres topology.
 -- scontrol - allow changes to the WorkDir for pending jobs.
 -- Enable coordinators to delete users if they only belong to accounts that
the coordinator is over.
 -- Fix regression on update from older versions with DefMemPerCPU.
 -- Fix issues with --gpu-bind while using cgroups.
 -- Suspend nodes after being down for SuspendTime.
 -- Fix rebooting nodes from skipping nextstate states on boot.
 -- Fix regression in reservation creation logic from 19.05.3 which would
incorrectly deny certain valid reservations from being created.
 -- slurmdbd - process sacct/sacctmgr job queries from older clients correctly.




Re: [slurm-users] Archived docs show 19.05 news

2019-10-25 Thread Tim Wickberg

On 10/25/19 12:49 AM, Benjamin Redling wrote:

Hello everybody,

confusing:

https://slurm.schedmd.com/archive/slurm-18.08.8/news.html
"
RELEASE NOTES FOR SLURM VERSION 19.05
28 May 2019
...
"


This is fixed now.


Bug-tracking is only via commercial support?


Anyone is welcome to file issues.

However, for sites without support contracts our engineers generally 
won't have time to respond. Issues with the website are still best sent 
in as a ticket, and we will usually address them ASAP regardless of 
whether they were reported by a customer or a wider community member.


- Tim



[slurm-users] Slurm User Group 2019 (SLUG19) presentations online, SC19

2019-10-15 Thread Tim Wickberg
Many thanks to all the attendees, and especially to all those who 
presented at the Slurm User Group 2019 meeting in Salt Lake City. Thank 
you to the University of Utah as well for hosting.


I hope to see many of you again at SLUG'20, which will be held at Harvard 
University on September 15-16, 2020.


PDFs of the presentations are online at
http://slurm.schedmd.com/publications.html

For those of you who will be at SC19 in Denver - we hope to see you at 
the Slurm booth (#1571), and at the Slurm "Birds of a Feather" session 
on Thursday, November 21st, from 12:15 - 1:15pm, in rooms 
401/402/403/404. As always, there will be a number of presentations in 
the Slurm booth - please check the display in the booth for the full 
schedule.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 19.05.3 is now available

2019-10-03 Thread Tim Wickberg
Slurm version 19.05.3 is now available, and includes a series of fixes 
since 19.05.2 was released nearly two months ago.


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 19.05.3
==
 -- Fix missing check from conversion of cray -> cray_aries.
 -- Improve job state reason string when required nodes are not available by
not including those that don't belong to the job partition.
 -- Set a more appropriate ESLURM_RESERVATION_MAINT job state reason for jobs
requesting feature(s) and required nodes are in a maintenance reservation.
 -- Fix logic to better handle maintenance reservations.
 -- Add spank options to cache in remote callback.
 -- Enforce the use of spank_option_getopt().
 -- Fix select plugins' will run test under-allocating nodes usage for
completing jobs.
 -- Nodes in COMPLETING state treated as being currently available for job
will-run test.
 -- Cray - fix contribs slurm.conf.j2 with updated cray_aries plugin names.
 -- job_submit/lua - fix problem where nil was expected for min_mem_per_cpu.
 -- Fix extra, unaccounted TRESRunMins usage created by heterogeneous jobs when
running with the priority/multifactor plugin.
 -- Detach threads once they are done to avoid having to join them
in track scripts code.
 -- Handle situation where a slurmctld tries to communicate with slurmdbd more
than once at the same time.
 -- Fix XOR/XAND features like cpu&[knl|westmere] to be resolved
correctly.
 -- Don't update [min|max]_exit_code on job array task requeue.
 -- Don't assume the first node of a job is the batch host when testing if the
job's allocated nodes are booted/ready.
 -- Make --batch= requests wait for all nodes to be booted so that it
can choose the batch host after the nodes have been booted -- possibly with
different features.
 -- Fix talking to batch host on its protocol version when using --batch.
 -- gres/mic plugin - add missing fini() function to clean up plugin state.
 -- Move _validate_node_choice() before prolog/epilog check.
 -- Look forward one week when creating a new reservation.
 -- Set missing resv_desc.flags before calling _select_nodes().
 -- Use correct start_time for TIME_FLOAT reservation in _job_overlap().
 -- Properly enforce a job's mem-per-cpu option when allocating the node
exclusively to that job.
 -- sched/backfill - clear estimated sched_nodes as done for start_time.
 -- Have safe_[read|write] handle EAGAIN and EINTR.
 -- Fix checking for flag with logical AND.
 -- Correct "extern" definition of variable if compiling with __APPLE__.
 -- Deprecate FastSchedule. FastSchedule will be removed in 20.02.
The FastSchedule=2 functionality (used for testing and development) has
been retained as the new SlurmdParameters=config_overrides option.
 -- Fix preemption issue when picking nodes for a feature job request.
 -- Fix race condition preventing held array job from getting a db_index.
 -- Fix select/cons_tres gres code infinite loop leaving slurmctld unresponsive.
 -- Remove redefinition of global variable in gres.c
 -- Fix issue where GPU devices are denied access when MPS is enabled.
 -- Fix uninitialized errors when compiling with CFLAGS="--coverage".
 -- Fix scancel --full for proctrack/cgroups.
 -- Fix sdiag backfill last and mean queue length stats.
 -- Do not remove batch host when resizing/shrinking a batch job.
 -- nss_slurm - fix file descriptor leaks.
 -- Fix preemption for jobs using complex feature requests
(e.g. -C "[rack1*2*4]").
 -- Fix memory leaks in preemption when jobs request multiple features.
 -- Allow Operator users to show/fix runaways.
 -- Disallow coordinators to show/fix runaways.
 -- mpi/pmi2 - increase array len to avoid buffer size exceeded error.
 -- Preserve rebooting node's nextstate when updating state with scontrol.
 -- Fully merge slurm.conf and gres.conf before node_config_load().
 -- Remove FastSchedule dependence from gres.conf's AutoDetect=nvml.
 -- Forbid mix of typed and untyped GRES of same name in slurm.conf.
 -- cons_tres: Prevent creating a job without CPUs.
 -- Prevent underflow when filtering cores with gres.
 -- proctrack/cray_aries: use current pid instead of thread if we're in a fork.
 -- Fix missing check for prolog launch credential creation failure that can
lead to segfaults




Re: [slurm-users] Up-to-date agenda for SLUG 2019?

2019-09-16 Thread Tim Wickberg
Thanks for the reminder. The final version is online now. (The only 
important change is that the time for dinner has been filled in, and the 
schedule is no longer marked as preliminary.)


See you folks tomorrow!

- Tim

On 9/16/19 8:53 AM, Bjørn-Helge Mevik wrote:

The agenda on https://slurm.schedmd.com/slurm_ug_agenda.html is still
called "Preliminary Schedule", and has not been updated since July 19.

Is this the latest agenda, or is there a newer one somewhere?





[slurm-users] Slurm version 19.05.2 is now available

2019-08-13 Thread Tim Wickberg
Slurm version 19.05.2 is now available, and includes a series of minor 
bug fixes since 19.05.1 was released over a month ago.


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 19.05.2
==
 -- Wrap END_TIMER{,2,3} macro definition in "do {} while (0)" block.
 -- Allow account coordinators to add users who don't already have an
association with any account.
 -- If only allowing particular alloc nodes in a partition, deny any request
coming from an alloc node of NULL.
 -- Prevent partial-load of plugins which can leave certain interfaces in
an inconsistent state.
 -- Remove stray __USE_GNU macro definitions from source.
 -- Fix loading fed state by backup on subsequent takeovers.
 -- Add missing job read lock when loading fed job state.
 -- Add missing fed_job_info jobs if fed state is lost.
 -- Do not build cgroup plugins on FreeBSD or NetBSD, and use proctrack/pgid
by default instead.
 -- Do not build switch/cray_aries plugin on FreeBSD, NetBSD, or macOS.
 -- Fix build on FreeBSD.
 -- Fix race condition in route/topology plugin.
 -- In munge decode set the alloc_node field to the text representation of an
IP address if the reverse lookup fails.
 -- Fix infinite loop in slurmstepd handling for nss_slurm REQUEST_GETGR RPC.
 -- Fix slurmstepd early assertion fail which prevented batch job launch or
tasks launch on non-Linux systems.
 -- Fix regression with SLURM_STEP_GPUS env var being renamed SLURM_STEP_GRES.
 -- Fix pmix v3 linking if no rpath is allowed on build.
 -- Fix sacctmgr error handling when removing associations and users.
 -- Allow sacctmgr to add users to WCKeys without having TrackWCKey set in the
slurm.conf.
 -- Allow sacctmgr to delete WCKeys from users.
 -- Change GRES type set by gpu/gpu_nvml plugin to be more specific - based
on device name instead of brand name.
 -- cli_filter - fix logic error with option lookup functions.
 -- Fix bad testing of NodeFeatures debug flag in contribs/cray.
 -- Cleanup track_script code to avoid race conditions and invalid memory
access.
 -- Fix jobs being killed after being requeued by preemption.
 -- Make register nodes verify correctly when using cons_tres.
 -- Fix srun --mem-per-cpu being ignored.
 -- Fix segfault in _update_job() under certain conditions.
 -- job_submit/lua - restore slurm.FAILURE as a synonym for slurm.ERROR.




[slurm-users] Slurm versions 19.05.1 and 18.08.8 are now available (CVE-2019-12838)

2019-07-10 Thread Tim Wickberg
Slurm versions 19.05.1 and 18.08.8 are now available, and include a 
series of recent bug fixes, as well as a fix for a security 
vulnerability (CVE-2019-12838) related to the 'sacctmgr archive load' 
functionality.


While fixes are only available for the currently supported 19.05 and 
18.08 releases, similar vulnerabilities affect past versions as well and 
sites are encouraged to upgrade to a supported version.


SchedMD customers were informed on June 26th and provided a patch on 
request; this process is documented in our security policy [1].


Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 19.05.1
==
 -- accounting_storage/mysql - fix incorrect function names in error messages.
 -- accounting_storage/slurmdbd - trigger an fsync() on the dbd.messages state
file to ensure it is committed to disk properly.
 -- Avoid JobHeldUser state reason from being updated at allocation time.
 -- Fix dump/load of rejected heterogeneous jobs.
 -- For heterogeneous jobs, do not count each component against the QOS or
association job limit multiple times.
 -- Comment out documentation for the incomplete and currently unusable
burst_buffer/generic plugin.
 -- Add new error ESLURM_INVALID_TIME_MIN_LIMIT to make note when a time_min
limit is invalid based on timelimit.
 -- Correct slurmdb cluster record pack with NULL pointer input.
 -- Clearer error message for ESLURM_INVALID_TIME_MIN_LIMIT.
 -- Fix SchedulerParameter bf_min_prio_reserve error when not the last parameter
 -- When fixing runaway jobs, change to reroll from earliest submit time, and
never reroll from Unix epoch.
 -- Display submit time when running sacctmgr show runawayjobs and add format
option to display eligible time.
 -- jobcomp/elasticsearch - fix minor race related to JobCompLoc setup.
 -- For HetJobs, ensure SLURM_PACK_JOB_ID is set regardless of whether
PrologFlags=Alloc is enabled.
 -- Fix PriorityFlags regression with the mutation of FAIR_TREE to NO_FAIR_TREE.
 -- select/cons_res - fix debug flag SelectType handling in select_p_job_test.
 -- Fix sacctmgr archive dump commit confirmation.
 -- Prevent extra resources from being allocated when combining certain flags.
 -- Cray - fix template generator with updated cray_aries plugin names.
 -- accounting_storage/slurmdbd - provide additional detail in several error
messages.
 -- Backfill - If a job has a time_limit guess the end time of a job better
if OverTimeLimit is Unlimited.
 -- Remove premature call to get system gpus before querying fake gpus that
should override the real.
 -- Fix segfault in epilog_set_env() when gres_devices is NULL.
 -- Fix (un)supported states in sacct.
 -- Adjust build system to no longer use the AC_FUNC_MALLOC autoconf macro.
 -- srun - restore the --cpu_bind option to srun.
 -- Add UsageFactorSafe QOS flag to control applying UsageFactor at
submission/scheduling time.
 -- Create missing reservations on DBD_MODIFY_RESV.
 -- Add error message when attempting to update association manager and object
doesn't exist.
 -- Fix security issue in accounting_storage/mysql plugin on archive file loads
by always escaping strings within the slurmdbd. CVE-2019-12838.



* Changes in Slurm 18.08.7
==
 -- Set debug statement to debug2 to avoid benign error messages.
 -- Add SchedulerParameters option of bf_hetjob_immediate to attempt to start
a heterogeneous job as soon as all of its components are determined able to
do so.
 -- Fix underflow causing decay thread to exit.
 -- Fix main scheduler not considering hetjobs when building the job queue.
 -- Fix regression for sacct to display old jobs without a start time.
 -- Fix setting correct number of gres topology bits.
 -- Update hetjobs pending state reason when appropriate.
 -- Fix accounting_storage/filetxt's understanding of TRES.
 -- Set Accrue time when not enforcing limits.
 -- Fix srun segfault when requesting a hetjob with test_exec or bcast options.
 -- Hide multipart priorities log message behind Priority debug flag.
 -- sched/backfill - Make hetjobs sensitive to bf_max_job_start.
 -- Fix slurmctld segfault due to job's partition pointer NULL dereference.
 -- Fix issue with OR'ed job dependencies.
 -- Add new job's bit_flags of INVALID_DEPEND to prevent rebuilding a job's
dependency string when it has at least one invalid and purged dependency.
 -- Promote federation unsynced siblings log message from debug to info.
 -- burst_buffer/cray - fix slurmctld SIGABRT due to illegal read/writes.
 -- burst_buffer/cray - fix memory leak due to unfreed job script content.
 -- node_features/knl_cray - fix script_argv use-after-free.
 -- burst_buffer/cray - fix script_argv use-after-free.
 -- Fix invalid reads of size 1 due to non null-terminated string reads.

Re: [slurm-users] Call for Abstracts - 2019 Slurm User Group Meeting

2019-06-24 Thread Tim Wickberg
This is a combined reminder and extension for the Call for Abstracts for 
presentations for the 2019 Slurm User Group Meeting. The deadline is now 
extended by an additional week - abstracts must now be received by July 
5th for consideration.


As an additional reminder, early registration will end on July 14th, 
after which time the registration fee will increase to the standard rate.


Please contact Jacob with abstract submissions, or any questions.

- Tim

On 05/14/2019 02:02 PM, Jacob Jenson wrote:
You are invited to submit an abstract of a tutorial, technical 
presentation or site report to be given at the 2019 Slurm User 
Group Meeting. This event is sponsored and organized by the University 
of Utah and SchedMD. This international event is open to those who 
want to:


  * Learn more about Slurm, a highly scalable Resource Manager and Job
Scheduler
  * Share their knowledge and experience with other users and
administrators
  * Get detailed information about the latest features and developments
  * Share requirements and discuss future developments

Everyone who wants to present their own usage, developments, site 
report, or tutorial about Slurm is invited to send an abstract to 
sl...@schedmd.com 


*Important Dates:*
28 June 2019: Abstracts due
12 July 2019: Notification of acceptance

*Slurm User Group Meeting 2019*
17-18 September 2019
Salt Lake City Utah



On 05/14/2019 02:03 PM, Jacob Jenson wrote:
Registration for the 2019 Slurm User Group Meeting is open. You can register at https://slug19.eventbrite.com/ 


The meeting will be held on 17-18 September 2019 in Salt Lake City at the 
University of Utah

Early registration
May 14 through July 14
$300 USD
Standard registration
July 15 through August 15
$375 USD
Late registration
August 16 through August 31
$600 USD

A block of rooms has been reserved at the University Guest House for those attending. The University Guest House is conveniently located next to the conference meeting room. You can reserve a room at the Guest House by calling +1-888-416-4075 by August 16, 2019. Please mention the group name “Slurm User Group 2019” in order to receive a discounted room rate. Online reservations can be made at https://www.universityguesthouse.com/University-Guest-House

Please contact me with any questions regarding the 2019 Slurm User Group Meeting. 


Jacob




[slurm-users] Slurm version 19.05.0 is now available

2019-05-28 Thread Tim Wickberg
After 9 months of development and testing we are pleased to announce the 
availability of Slurm version 19.05.0!


Downloads are available from https://www.schedmd.com/downloads.php.

Highlights of the 19.05 release include:

- The new select/cons_tres plugin, which introduces new GPU-specific job 
submission options and extends Slurm's backfill scheduling logic to 
cover resources beyond just cpus and memory (a brief submission sketch 
follows this list).


- A new NSS library - nss_slurm - has been developed, which can provide 
directory info for the job step's user to local processes.


- Heterogeneous Job support on Cray Aries systems.

- A new "Association" priority factor, and corresponding 
PriorityWeightAssoc setting, providing for an alternative approach to 
establishing relative priority values between groups.


- Two new plugin APIs intended for sites to customize their Slurm 
installations: cli_filter and site_factor.
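
A minimal sketch of submitting a GPU job with some of the options added
alongside select/cons_tres; the options shown (--gpus-per-node, --cpus-per-gpu,
--mem-per-gpu) are among those documented for 19.05, and the batch script name
is hypothetical - check the sbatch/srun man pages for your build for the full
set.

    # Sketch: submit a GPU job using options added alongside select/cons_tres.
    import subprocess

    cmd = [
        "sbatch",
        "--gpus-per-node=2",   # GPUs requested on each allocated node
        "--cpus-per-gpu=4",    # CPUs allocated per GPU
        "--mem-per-gpu=8G",    # memory allocated per GPU
        "train.sh",            # hypothetical batch script
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    print(result.stdout.strip())  # e.g. "Submitted batch job 12345"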


Thank you to all customers, partners, and community members who 
contributed to getting this release done.


As with past releases, the documentation available at 
https://slurm.schedmd.com has been updated to the 19.05 release. Past 
versions are available in the archive. This release also marks the end 
of support for the 17.11 release. The 18.08 release will remain 
supported up until the 20.02 release in February, but will stop 
receiving as frequent updates, and bug-fixes will be targeted for the 
19.05 maintenance releases going forward.


--
Tim Wickberg
Chief Technology Officer, SchedMD
Commercial Slurm Development and Support



[slurm-users] Slurm release candidate version 19.05.0rc1 available for testing

2019-04-30 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 19.05.0rc1.


This is the first release candidate version of the upcoming 19.05 
release series, and represents the end of development for the release 
cycle, and a finalization of the RPC and state file formats.


If any issues are identified with this new release candidate, please 
report them through https://bugs.schedmd.com against the 19.05.x version 
and we will address them before the first production 19.05.0 release is 
made.


Please note that the release candidates are not intended for production 
use. Barring any late-discovered issues, the state file formats should 
not change between now and 19.05.0 and are considered frozen at this 
time for the 19.05 release.


A preview of the updated documentation can be found at 
https://slurm.schedmd.com/archive/slurm-master/ .


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 18.08.7 is now available

2019-04-11 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 18.08.7.

This includes over 20 fixes since 18.08.6 was released last month, 
including one for a regression that caused issues with 'sacct -J' not 
returning results correctly.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 18.08.7
==
 -- Set debug statement to debug2 to avoid benign error messages.
 -- Add SchedulerParameters option of bf_hetjob_immediate to attempt to start
a heterogeneous job as soon as all of its components are determined able to
do so.
 -- Fix underflow causing decay thread to exit.
 -- Fix main scheduler not considering hetjobs when building the job queue.
 -- Fix regression for sacct to display old jobs without a start time.
 -- Fix setting correct number of gres topology bits.
 -- Update hetjobs pending state reason when appropriate.
 -- Fix accounting_storage/filetxt's understanding of TRES.
 -- Set Accrue time when not enforcing limits.
 -- Fix srun segfault when requesting a hetjob with test_exec or bcast options.
 -- Hide multipart priorities log message behind Priority debug flag.
 -- sched/backfill - Make hetjobs sensitive to bf_max_job_start.
 -- Fix slurmctld segfault due to job's partition pointer NULL dereference.
 -- Fix issue with OR'ed job dependencies.
 -- Add new job's bit_flags of INVALID_DEPEND to prevent rebuilding a job's
dependency string when it has at least one invalid and purged dependency.
 -- Promote federation unsynced siblings log message from debug to info.
 -- burst_buffer/cray - fix slurmctld SIGABRT due to illegal read/writes.
 -- burst_buffer/cray - fix memory leak due to unfreed job script content.
 -- node_features/knl_cray - fix script_argv use-after-free.
 -- burst_buffer/cray - fix script_argv use-after-free.
 -- Fix invalid reads of size 1 due to non null-terminated string reads.
 -- Add extra debug2 logs to identify why BadConstraints reason is set.




[slurm-users] Slurm versions 18.08.6 is now available, as well as 19.05.0pre2, and Slurm on GCP update

2019-03-07 Thread Tim Wickberg
We are pleased to announce the availability of Slurm version 18.08.6, as 
well as the second 19.05 release preview version 19.05.0pre2.


The 18.08.6 release includes over 50 fixes since the last maintenance 
release was made five weeks ago.


The second preview of the 19.05 release - 19.05.0pre2 - is meant to 
highlight additional functionality coming with the new select/cons_tres 
plugin, alongside other recent development work. Please consult the 
RELEASE_NOTES file for a detailed list of changes made to date.


Please note that preview releases are meant for testing and development 
only, and should not be used in production, are not supported, and that 
you cannot migrate to a newer release from these without potential loss 
of data and your job queues.


I'd also like to call attention to some of our recent work in 
partnership with Google. There's a blog post today highlighting some of 
this recent work both on Slurm and with the slurm-gcp integration 
scripts (https://github.com/SchedMD/slurm-gcp):


https://cloud.google.com/blog/products/compute/hpc-made-easy-announcing-new-features-for-slurm-on-gcp

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 18.08.6
==
 -- Added parsing of -H flag with scancel.
 -- Fix slurmsmwd build on 32-bit systems.
 -- acct_gather_filesystem/lustre - add support for Lustre 2.12 client.
 -- Fix per-partition TRES factors/priority
 -- Fix per-partition NICE priority
 -- Fix partition access check validation for multi-partition job submissions.
 -- Prevent segfault on empty response in 'scontrol show dwstat'.
 -- node_features/knl_cray plugin - Preserve node's active features if it has
already booted when slurmctld daemon is reconfigured.
 -- Detect missing burst buffer script and reject job.
 -- GRES: Properly reset the topo_gres_cnt_alloc counter on slurmctld restart
to prevent underflow.
 -- Avoid errors from packing accounting_storage_mysql.so when RPM is built
without mysql support.
 -- Remove deprecated -t option from slurmctld --help.
 -- acct_gather_filesystem/lustre - fix stats gathering.
 -- Enforce documented default usage start and end times when querying jobs from
the database.
 -- Fix issues when querying running jobs from the database.
 -- Deny sacct request where start time is later than the end time requested.
 -- Fix sacct verbose about time and states queried.
 -- burst_buffer/cray - allow 'scancel --hurry ' to tear down a burst
buffer that is currently staging data out.
 -- X11 forwarding - allow setup if the DISPLAY environment variable lacks
a screen number. (Permit both "localhost:10.0" and "localhost:10".)
 -- docs - change HTML title to include the page title or man page name.
 -- X11 forwarding - fix an unnecessary error message when using the
local_xauthority X11Parameters option.
 -- Add use_raw_hostname to X11Parameters.
 -- Fix smail so it passes job arrays to seff correctly.
 -- Don't check InactiveLimit for salloc --no-shell jobs.
 -- Add SALLOC_GRES and SBATCH_GRES as input to salloc/sbatch.
 -- Remove drain state when node doesn't reboot by ResumeTimeout.
 -- Fix considering "resuming" nodes in scheduling.
 -- Do not kill suspended jobs due to exceeding time limit.
 -- Add NoAddrCache CommunicationParameter.
 -- Don't ping powering up cloud nodes.
 -- Add cloud_dns SlurmctldParameter.
 -- Consider --sbindir configure option as the default path to find slurmstepd.
 -- Fix node state printing of DRAINED$
 -- Fix spamming dbd of down/drained nodes in maintenance reservation.
 -- Avoid buffer overflow in time_str2secs.
 -- Calculate suspended time for suspended steps.
 -- Add null check for step_ptr->step_node_bitmap in _pick_step_nodes.
 -- Fix multi-cluster srun issue after 'scontrol reconfigure' was called.
 -- Fix accessing response_cluster_rec outside of write locks.
 -- Fix Lua user messages not showing up on rejected submissions.
 -- Fix printing multi-line error messages on rejected submissions.




Re: [slurm-users] Slurm versions 17.11.13 and 18.08.5 are now available (CVE-2019-6438)

2019-01-30 Thread Tim Wickberg

Forgot to attach the release notes, they are included below for reference:


* Changes in Slurm 18.08.5
==
 -- Backfill - If a job has a time_limit guess the end time of a job better
if OverTimeLimit is Unlimited.
 -- Fix "sacctmgr show events event=cluster"
 -- Fix sacctmgr show runawayjobs from sibling cluster
 -- Avoid bit offset of -1 in call to bit_nclear().
 -- Insure that "hbm" is a configured GresType on knl systems.
 -- Fix NodeFeaturesPlugins=node_features/knl_generic to allow gres other
than knl.
 -- cons_res: Prevent overflow on multiply.
 -- Better debug for bad values in gres.conf.
 -- Fix double accounting of energy at end of job.
 -- Read gres.conf for cloud nodes on slurmctld.
 -- Don't assume the first node of a job is the batch host when purging jobs
from a node.
 -- Better debugging when a job doesn't have a job_resrcs ptr.
 -- Store ave watts in energy plugins.
 -- Add XCC plugin for reading Lenovo Power.
 -- Fix minor memory leak when scheduling rebootable nodes.
 -- Fix debug2 prefix for sched log.
 -- Fix printing correct SLURM_JOB_ACCOUNT_PACK_GROUP_* in env for a Het Job.
 -- sbatch - search current working directory first for job script.
 -- Make it so held jobs reset the AccrueTime and do not count against any
AccrueTime limits.
 -- Add SchedulerParameters option of bf_hetjob_prio=[min|avg|max] to alter the
job sorting algorithm for scheduling heterogeneous jobs.
 -- Fix initialization of assoc_mgr_locks and slurmctld_locks lock structures.
 -- Fix segfault with job arrays using X11 forwarding.
 -- Revert regression caused by e0ee1c7054 which caused negative values and
values starting with a decimal to be invalid for PriorityWeightTRES and
TRESBillingWeight.
 -- Fix possibility to update a job's reservation to none.
 -- Suppress connection errors to primary slurmdbd when backup dbd is active.
 -- Suppress connection errors to primary db when backup db kicks in
 -- Add missing fields for sacct --completion when using jobcomp/filetxt.
 -- Fix incorrect values set for UserCPU, SystemCPU, and TotalCPU sacct fields
when JobAcctGatherType=jobacct_gather/cgroup.
 -- Fixed srun from double printing invalid option msg twice.
 -- Remove unused -b flag from getopt call in sbatch.
 -- Disable reporting of node TRES in sreport.
 -- Re-enabling features combined by OR within parenthesis for non-knl setups.
 -- Prevent sending duplicate requests to reboot a node before ResumeTimeout.
 -- Down nodes that don't reboot by ResumeTimeout.
 -- Update seff to reflect API change from rss_max to tres_usage_in_max.
 -- Add missing TRES constants from perl API.
 -- Fix issue where sacct would return incorrect array tasks when querying
specific tasks.
 -- Add missing variables to slurmdb_stats_t in the perlapi.
 -- Fix nodes not getting reboot RPC when job requires reboot of nodes.
 -- Fix failing update the partition list of a job.
 -- Use slurm.conf gres ids instead of gres.conf names to get a gres type name.
 -- Add mitigation for a potential heap overflow on 32-bit systems in xmalloc.
CVE-2019-6438.



* Changes in Slurm 17.11.13
===
 -- Add mitigation for a potential heap overflow on 32-bit systems in xmalloc.
CVE-2019-6438.




[slurm-users] Slurm versions 17.11.13 and 18.08.5 are now available (CVE-2019-6438)

2019-01-30 Thread Tim Wickberg
Slurm versions 17.11.13 and 18.08.5 are now available, and include a 
series of recent bug fixes, as well as a fix for a security 
vulnerability (CVE-2019-6438) on 32-bit systems. We believe that 64-bit 
builds of Slurm - the overwhelming majority of installations - are not 
affected by this issue.


Downloads are available at https://www.schedmd.com/downloads.php .

While fixes are only available for the supported 17.11 and 18.08 
releases, similar vulnerabilities affect 32-bit builds on past versions 
as well. The only resolution is to upgrade Slurm to a fixed release.


SchedMD customers were informed on January 16th and provided a patch on 
request; this process is documented in our security policy [1].


Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 18.08.4 is now available

2018-12-11 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 18.08.4.

This includes over 70 fixes since 18.08.3 was released in October.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 18.08.4
==
 -- burst_buffer/cray - avoid launching a job that would be immediately
cancelled due to a DataWarp failure.
 -- Fix message sent to user to display preempted instead of time limit when
a job is preempted.
 -- Fix memory leak when a failure happens processing a node's gres config.
 -- Improve error message when failures happen processing a node's gres config.
 -- When building rpms ignore redundant standard rpaths and insecure relative
rpaths, for RHEL based distros which use "check-rpaths" tool.
 -- Don't skip jobs in scontrol hold.
 -- Avoid locking the job_list when unneeded.
 -- Allow --cpu-bind=verbose to be used with SLURM_HINT environment variable.
 -- Make it so fixing runaway jobs will not alter the same job requeued
when not runaway.
 -- Avoid checking state when searching for runaway jobs.
 -- Remove redundant check for end time of job when searching for runaway jobs.
 -- Make sure that we properly check for runawayjobs where another job might
have the same id (for example, if a job was requeued) by also checking the
submit time.
 -- Add scontrol update job ResetAccrueTime to clear a job's time
previously accrued for priority.
 -- cons_res: Delay exiting cr_job_test until after cores/cpus are calculated
and distributed.
 -- Fix bug where binary in cwd would trump binary in PATH with test_exec.
 -- Fix check to test printf("%s\n", NULL); to not require
-Wno-format-truncation CFLAG.
 -- Fix JobAcctGatherParams=UsePss to report the correct usage.
 -- Fix minor memory leak in pmix plugin.
 -- Fix minor memory leak in slurmctld when reading configuration.
 -- Handle return codes correctly from pthread_* functions.
 -- Fix minor memory leak when a slurmd is unable to contact a slurmctld
when trying to register.
 -- Fix sreport sizesbyaccount report when using Flatview and accounts.
 -- Fix incorrect shift when dealing with node weights and scheduling.
 -- libslurm/perl - Fix segfault caused by incorrect hv_to_slurm_ctl_conf.
 -- Add qos and assoc options to confirmation dialogs.
 -- Handle updating identical license or partition information correctly.
 -- Make sure accounts and QOS' are all lower case to match documentation
when read in from the slurm.conf file.
 -- Don't consider partitions without enough nodes in reservation,
main scheduler.
 -- Set SLURM_NTASKS correctly if having to determine from other options.
 -- Removed GCP scripts from contribs. Now located at:
https://github.com/SchedMD/slurm-gcp.
 -- Don't check existence of srun --prolog or --epilog executables when set to
"none" and SLURM_TEST_EXEC is used.
 -- Add "P" suffix support to job and step tres specifications.
 -- When doing a reconfigure handle QOS' GrpJobsAccrue correctly.
 -- Remove unneeded extra parentheses from sh5util.
 -- Fix jobacct_gather/cgroup to work correctly when more than one task is
started on a node.
 -- If requesting --ntasks-per-node with no tasks set tasks correctly.
 -- Accept modifiers for TRES originally added in 6f0342e0358.
 -- Don't remove reservation on slurmctld restart if nodes are removed from
configuration.
 -- Fix bad xfree in task/cgroup.
 -- Fix removing counters if a job array isn't subject to limits and is
canceled while pending.
 -- Make sure SLURM_NTASKS_PER_NODE is set correctly when env is overwritten
by the command line.
 -- Clean up step on a failed node correctly.
 -- mpi/pmix: Fixed the logging of collective state.
 -- mpi/pmix: Make multi-slurmd work correctly when using ring communication.
 -- mpi/pmix: Fix double invocation of the PMIx lib fence callback.
 -- mpi/pmix: Remove unneeded libpmix callback drop in tree-based coll.
 -- Fix race condition in route/topology when the slurmctld is reconfigured.
 -- In route/topology validate the slurmctld doesn't try to initialize the
node system.
 -- Fix issue when requesting invalid gres.
 -- Validate job_ptr in backfill before restoring preempt state.
 -- Fix issue when job's environment is minimal and only contains variables
Slurm is going to replace internally.
 -- When handling runaway jobs remove all usage before rollup to remove any
time that wasn't existent instead of just updating lines that have time
with a lesser time.
 -- salloc - set SLURM_NTASKS_PER_CORE and SLURM_NTASKS_PER_SOCKET in the
environment if the corresponding command line options are used.
 -- slurmd - fix handling of the -f flag to specify alternate config file
locations.
 -- Fix scheduling logic to avoid using nodes that require a reboot for KNL
node change whe

[slurm-users] Slurm User Group 2018 presentations online, SC18

2018-11-11 Thread Tim Wickberg
Many thanks to all the attendees, and especially to all those who 
presented at the Slurm User Group 2018 meeting in Madrid. Thank you to 
CIEMAT as well for hosting, and I hope to see many of you at SLUG'19 at 
the University of Utah in Salt Lake City.


PDFs of the presentations are online at
http://slurm.schedmd.com/publications.html

For those of you who will be at SC18 in Dallas - we hope to see you at 
the Slurm booth (#1242), and at the Slurm "Birds of a Feather" session 
on Thursday, November 15th, from 12:15 - 1:15pm, in rooms C155/C156. As 
always, there will be a number of presentations in the Slurm booth - 
please check the display for the full schedule.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm versions 18.08.3 and 17.11.12 are now available

2018-10-24 Thread Tim Wickberg
We are pleased to announce the availability of Slurm versions 18.08.3 
and 17.11.12.


These versions include a fix for a regression introduced in 18.08.2 and 
17.11.11 that could lead to a loss of accounting records if the slurmdbd 
was offline. All sites with 18.08.2 or 17.11.11 slurmctld processes are 
encouraged to upgrade them ASAP.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 18.08.3
==
 -- Fix regression in 18.08.2 that caused dbd messages to not be queued up
when the dbd was down.
 -- Fix regression in 18.08.1 that can cause a slurmctld crash when splitting
job array elements.



* Changes in Slurm 17.11.12
===
 -- Fix regression in 17.11.10 that caused dbd messages to not be queued up
when the dbd was down.




[slurm-users] Slurm versions 18.08.2 and 17.11.11 are now available, as well as 19.05.0pre1

2018-10-18 Thread Tim Wickberg
We are pleased to announce the availability of Slurm versions 18.08.2 
and 17.11.11, as well as the first 19.05 release preview version 
19.05.0pre1.


These versions include a fix for a regression introduced in 18.08.1 and 
17.11.10 that prevented the --get-user-env option from working 
correctly, alongside a few other minor changes.


The first preview of the 19.05 release - 19.05.0pre1 - is meant to 
highlight additional functionality coming with the new select/cons_tres 
plugin. Further details on this are in the presentation from SLUG'18 
which will be online (along with the rest of the SLUG'18 presentations) 
in the next week. Please note that preview releases are meant for 
testing and development only, and should not be used in production, are 
not supported, and that you cannot migrate to a newer release from these 
without potential loss of data and your job queues.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 18.08.2
==
 -- Correctly initialize variable in env_array_user_default().
 -- Remove race condition when signaling starting step.
 -- Fix issue where 17.11 jobs using GRES didn't initialize new 18.08
structures after unpack.
 -- Stop removing nodes once the minimum CPU or node count for the job is
reached in the cons_res plugin.
 -- Process any changes to MinJobAge and SlurmdTimeout in the slurmctld when
it is reconfigured to determine changes in its background timers.
 -- Use previous SlurmdTimeout in the slurmctld after a reconfigure to
determine the time a node has been down.
 -- Fix multi-cluster srun between clusters with different SelectType plugins.
 -- Fix removing job licenses on reconfig/restart when configured license
counts are 0.
 -- If a job requested multiple licenses and one license was removed, then on
a reconfigure/restart all of the licenses -- including the valid ones --
would be removed.
 -- Fix issue where job's license string wasn't updated after a restart when
licenses were removed or added.
 -- Add allow_zero_lic to SchedulerParameters.
 -- Avoid scheduling tasks in excess of ArrayTaskThrottle when canceling tasks
of an array.
 -- Fix jobs that request memory per node and task count that can't be
scheduled right away.
 -- Avoid infinite loop with jobacct_gather/linux when pids wrap around
/proc/sys/kernel/pid_max.
 -- Fix --parsable2 output for sacct and sstat commands to remove a stray
trailing delimiter.
 -- When modifying a user's name in sacctmgr enforce PreserveCaseUser.
 -- When adding a coordinator or user that was once deleted enforce
PreserveCaseUser.
 -- Correctly handle scenarios where a partition's MaxMemPerCPU is less than
a job's --mem-per-cpu and also -c is greater than 1.
 -- Set AccrueTime correctly when MaxJobsAccrue is disabled and BeginTime has
not been established.
 -- Correctly account for job arrays for new {Max/Grp}JobsAccrue limits.



* Changes in Slurm 17.11.11
===
 -- Correctly initialize variable in env_array_user_default().
 -- Correctly handle scenarios where a partition's MaxMemPerCPU is less than
a job's --mem-per-cpu and also -c is greater than 1.




[slurm-users] Slurm versions 18.08.1 and 17.11.10 are now available

2018-10-04 Thread Tim Wickberg
We are pleased to announce the availability of Slurm versions 18.08.1 
and 17.11.10.


This includes an extensive set of fixes made since 18.08.0 was released 
at the end of August, and for 17.11.10 since 17.11.9 was released at the 
start of August.


Please note that the 17.11.10 release is expected to be the last 
maintenance release of that series (barring any critical security 
issues) as our support team has shifted their attention to the 18.08 
release. Also note that support for 17.02 ended in August; SchedMD 
customers are encouraged to upgrade to a supported major release (18.08 
or 17.11) at their earliest convenience.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 18.08.1
==
 -- Remove commented-out parts of man pages related to cons_tres work in 19.05,
as these were showing up on the web version due to a syntax error.
 -- Prevent slurmctld performance issues in main background loop if multiple
backup controllers are unavailable.
 -- Add missing user read association lock in burst_buffer/cray during init().
 -- Fix incorrect spacing for PartitionName lines in 'scontrol write config'.
 -- Fix creation of the step hwloc xml file to occur after the cpuset cgroup
has been created.
 -- Add userspace as a valid default governor.
 -- Add timers to group_cache_lookup so that, if lookups are slow, we advise
setting LaunchParameters=send_gids.
 -- Fix SLURM_STEP_GRES=none to work correctly.
 -- Fix potential memory leak when a failure happens unpacking a ctld_multi_msg.
 -- Fix potential double free when a failure happens when unpacking a
node_registration_status_msg.
 -- Fix sacctmgr show runaways.
 -- Removed non-POSIX append operator from configure script for non-bash
support.
 -- Fix incorrect spacing for PartitionName lines in 'scontrol write config'.
 -- Fix sacct to not print huge reserve times when the job was never eligible.
 -- burst_buffer/cray - Add missing locks around assoc_mgr when timing out a
burst buffer.
 -- burst_buffer/cray - Update burst buffers when an association or qos
is removed from the system.
 -- Remove documentation for deprecated Cray/ALPS systems. Please switch to
Native Cray mode instead.
 -- Completely copy features when copying the list in the slurmctld.
 -- PMIX - Fix issue with packing processes when using an arbitrary task
distribution.
 -- Fix hostlists to be able to handle nodenames with '-' in them surrounded
by integers.
 -- Added sort option to sprio output.
 -- Fix correct job CPU count allocated.
 -- Fix sacctmgr setting GrpJobs limit when setting GrpJobsAccrue limit.
 -- Change the defaults to MemLimitEnforce=no and NoOverMemoryKill
(See RELEASE_NOTES).
 -- Prevent abort when using Cray node features plugin on non-knl.
 -- Add ability to reboot down nodes with scontrol reboot_nodes.
 -- Protect against sending to the slurmdbd if the connection has gone away.
 -- Fix invalid read when not using backup slurmctlds.
 -- Prevent acct coordinators from changing default acct on add user.
 -- Don't allow scontrol top to modify job priorities when priority == 1.
 -- slurmsmwd - change parsing code to handle systems with the svid or inst
fields set in xtconsumer output.
 -- Fix infinite loop in slurmctld if GRES is specified without a count.
 -- sacct: Print error when unknown arguments are found.
 -- Fix checking missing return codes when unpacking structures.
 -- Fix slurm.spec-legacy including slurmsmwd
 -- More explicit error message when cgroup oom-kill events detected.
 -- When updating an association and are unable to find parent association
initialize old fairshare association pointer correctly.
 -- Wrap slurm_cond_signal() calls with mutexes where needed.
 -- Fix correct timeout with resends in slurm_send_only_node_msg.
 -- Fix pam_slurm_adopt to honor action_adopt_failure.
 -- Have the slurmd recreate the hwloc xml file for the full system on restart.
 -- sdiag - correct the units for the gettimeofday() stat to microseconds.
 -- Set SLURM_CLUSTER_NAME environment variable in MailProg to the ClusterName.
 -- smail - use SLURM_CLUSTER_NAME environment variable.
 -- job_submit/lua - expose argc/argv options through lua interface.
 -- slurmdbd - prevent false-positive warning about innodb settings having
been set too low if they're actually set over 2GB.

* Changes in Slurm 17.11.10
===
 -- Move priority_sort_part_tier from slurmctld to libslurm to make it possible
to run the regression tests 24.* without changing that code since it links
directly to the priority plugin where that function isn't defined.
 -- Fix issue where job time limits can increase to max walltime when updating
a job with scontrol.
 -- Fix invalid protocol_version manipulation on big endian platforms causing
srun and sattach to fail.
 -- Fix for QOS

[slurm-users] Slurm User Group Meeting 2018 Agenda is online

2018-09-10 Thread Tim Wickberg

Just a quick note to mention that the SLUG'18 agenda has been posted online:

https://slurm.schedmd.com/slurm_ug_agenda.html



[slurm-users] Slurm release candidate version 18.08.0rc1 available for testing

2018-08-21 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 
version 18.08.0rc1.


This is the first release candidate version of the upcoming 18.08 
release series, and represents the end of development for the release 
cycle, and a finalization of the RPC and state file formats.


If any issues are identified with this new release candidate, please 
report them through https://bugs.schedmd.com against the 18.08.x version 
and we will address them before the first production 18.08.0 release is 
made.


Please note that the release candidate versions are not intended for production 
use. Barring any late-discovered issues, the state file formats should 
not change between now and 18.08.0 and are considered frozen at this 
time for the 18.08 release.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 17.11.9 available

2018-08-09 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 17.11.9.

This includes 10 fixes made since 17.11.8 was released last month, 
including a fix to prevent hung srun processes that can manifest during 
massively parallel jobs.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 17.11.9
==
 -- Fix segfault in slurmctld when a job's node bitmap is NULL during a
scheduling cycle.  Primarily caused by EnforcePartLimits=ALL.
 -- Remove erroneous unlock in acct_gather_energy/ipmi.
 -- Enable support for hwloc version 2.0.1.
 -- Fix 'srun -q' (--qos) option handling.
 -- Fix socket communication issue that can lead to lost task completion
messages, which will cause a permanently stuck srun process.
 -- Handle creation of TMPDIR if environment variable is set or changed in
a task prolog script.
 -- Avoid node layout fragmentation if running with a fixed CPU count but
without Sockets and CoresPerSocket defined.
 -- burst_buffer/cray - Fix datawarp swap default pool overriding jobdw.
 -- Fix incorrect job priority assignment for multi-partition job with
different PriorityTier settings on the partitions.
 -- Fix sinfo to print correct node state




[slurm-users] Slurm pre-release version 18.08.0pre2 available for testing

2018-08-02 Thread Tim Wickberg
We are pleased to announce the availability of Slurm pre-release version 
18.08.0pre2.


This is the second pre-release version of the upcoming 18.08 release 
series, and represents a working snapshot of recent developments. 
Interested parties are encouraged to test this out ahead of the RPC, 
ABI, and state file format freezes that will occur when the first 
release candidate is made available later next week.


If any issues are identified with this new pre-release, please report 
them through https://bugs.schedmd.com against the 18.08.x version and we 
will try to address them before the first release candidate is made 
available.


Please note that the pre-release versions are not intended for 
production use, and that the state file formats are still subject to 
change. As such, you should not expect to safely transition between the 
pre-release versions and the eventual release. This is intended for 
local testing, especially of unusual system configurations, ahead of the 
RPC freeze and the first production 18.08.0 release.


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



[slurm-users] Slurm version 17.11.8 available

2018-07-19 Thread Tim Wickberg

We are pleased to announce the availability of Slurm version 17.11.8.

This includes over 30 fixes made since 17.11.7 was released at the end 
of May, including a change to the slurmd.service file used with systemd. 
That change prevents systemd from destroying the cgroup hierarchies that 
slurmd/slurmstepd have created whenever 'systemctl daemon-reload' is 
called (e.g., by yum/rpm).
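
For sites not yet able to pick up the updated unit file shipped with this
release, the same setting can usually be applied locally with a systemd
drop-in; the snippet below is a sketch only, and the drop-in path is
illustrative.

    # Illustrative drop-in path: /etc/systemd/system/slurmd.service.d/override.conf
    [Service]
    Delegate=yes

    # Then run 'systemctl daemon-reload' so the override takes effect.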


Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support


* Changes in Slurm 17.11.8
==
 -- Fix incomplete RESPONSE_[RESOURCE|JOB_PACK]_ALLOCATION building path.
 -- Do not allocate nodes that were marked down due to the node not responding
by ResumeTimeout.
 -- task/cray plugin - search for "mems" cgroup information in the file
"cpuset.mems" then fall back to the file "mems".
 -- Fix ipmi profile debug uninitialized variable.
 -- Improve detection of Lua package on older RHEL distributions.
 -- PMIx: fixed the direct connect inline msg sending.
 -- MYSQL: Fix issue not handling all fields when loading an archive dump.
 -- Allow a job_submit plugin to change the admin_comment field during
job_submit_plugin_modify().
 -- job_submit/lua - fix access into reservation table.
 -- MySQL - Prevent deadlock caused by archive logic locking reads.
 -- Don't enforce MaxQueryTimeRange when requesting specific jobs.
 -- Modify --test-only logic to properly support jobs submitted to more than
one partition.
 -- Prevent slurmctld from abort when attempting to set non-existing
qos as def_qos_id.
 -- Add new job dependency type of "afterburstbuffer". The pending job will be
delayed until the first job completes execution and its burst buffer
stage-out is completed.
 -- Reorder proctrack/task plugin load in the slurmstepd to match that of slurmd
and avoid a race condition that calling task before proctrack can introduce.
 -- Prevent reboot of a busy KNL node when requesting inactive features.
 -- Revert to previous behavior when requesting memory per cpu/node introduced
in 17.11.7.
 -- Fix to reinitialize previously adjusted job members to their original value
when validating the job memory in multi-partition requests.
 -- Fix _step_signal() from always returning SLURM_SUCCESS.
 -- Combine active and available node feature change logs on one line rather
than one line per node for performance reasons.
 -- Prevent occasionally leaking freezer cgroups.
 -- Fix potential segfault when closing the mpi/pmi2 plugin.
 -- Fix issues with --exclusive=[user|mcs] to work correctly
with preemption or when job requests a specific list of hosts.
 -- Make code compile with hdf5 1.10.2+
 -- mpi/pmix: Fixed the collectives canceling.
 -- SlurmDBD: improve error message handling on archive load failure.
 -- Fix incorrect locking when deleting reservations.
 -- Fix incorrect locking when setting up the power save module.
 -- Fix setting format output length for squeue when showing array jobs.
 -- Add xstrstr function.
 -- Fix printing out of --hint options in sbatch, salloc --help.
 -- Prevent possible divide by zero in _validate_time_limit().
 -- Add Delegate=yes to the slurmd.service file to prevent systemd from
interfering with the jobs' cgroup hierarchies.




  1   2   >