We are pleased to announce the availability of Slurm versions 15.08.11 and 16.05.0-rc1 (release candidate 1).

15.08.11 contains around 25 rather minor bug fixes, detailed below. Please upgrade at your leisure.

The rc release contains all of the features intended for release 16.05. Development has ended for this release, and we are continuing with our testing phase, which will most likely result in another rc before we tag 16.05.0 near the end of the month. A description of what this release contains is in the RELEASE_NOTES file available in the source. Your help in hardening this version is greatly appreciated. You are invited to download this version and assist in testing.

Slurm downloads are available from http://schedmd.com/#repos

* Changes in Slurm 15.08.11
===========================
 -- Fix for job "--contiguous" option that could cause job allocation/launch
    failure or slurmctld crash.
 -- Fix to setup logs for single-character program names correctly.
 -- Backfill scheduling performance enhancement with large number of running
    jobs.
 -- Reset job's prolog_running counter on slurmctld restart or reconfigure.
 -- burst_buffer/cray - Update job's prolog_running counter if pre_run fails.
 -- MYSQL - Make the error message more specific when removing a reservation
    and it doesn't meet basic requirements.
 -- burst_buffer/cray - Fix for script creating or deleting persistent buffer
    would fail "paths" operation and hold the job.
 -- power/cray - Prevent possible divide by zero.
 -- power/cray - Fix bug introduced in 15.08.10 preventing operation in many
    cases.
 -- Prevent deadlock for flow of data to the slurmdbd when sending reservation
    that wasn't set up correctly.
 -- burst_buffer/cray - Don't call Datawarp "paths" function if script includes
    only create or destroy of persistent burst buffer. Some versions of Datawarp
    software return an error for such scripts, causing the job to be held.
 -- Fix potential issue when adding and removing TRES which could result
    in the slurmdbd segfaulting.
 -- Add cast to memory limit calculation to prevent integer overflow for
    very large memory values.
 -- Bluegene - Fix issue with reservations resizing under the covers on a
    restart of the slurmctld.
 -- Avoid error message of "Requested cpu_bind option requires entire node to
    be allocated; disabling affinity" being generated in some cases where
    task/affinity and task/cgroup plugins used together.
 -- Fix version issue when packing GRES information between 2 different versions
    of Slurm.
 -- Fix for srun hanging with OpenMPI and PMIx
 -- Better initialization of node_ptr when dealing with protocol_version.
 -- Fix incorrect type when initializing header of a message.
 -- MYSQL - Fix incorrect usage of limit and union.
 -- MYSQL - Remove 'ignore' from alter ignore when updating a table.
 -- Documentation - update prolog_epilog page to reflect current behavior
    if the Prolog fails.
 -- Documentation - clarify behavior of 'srun --export=NONE' in man page.
 -- Fix potential gres underflow on restart of slurmctld.
 -- Fix sacctmgr to remove a user who has no associations.
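
For readers wondering what the "add cast to memory limit calculation" entry above guards against, here is a minimal sketch of that general class of bug. It is not the actual Slurm patch: the function names, the per-CPU/CPU-count framing, and the MB units are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not Slurm code): total memory in MB for a job,
 * given an assumed per-CPU limit and CPU count. Casting one operand
 * widens the multiplication to 64 bits, so the product cannot wrap. */
static uint64_t total_mem_mb(uint32_t mem_per_cpu_mb, uint32_t cpu_count)
{
    return (uint64_t)mem_per_cpu_mb * cpu_count;
}

/* Same calculation without the cast: the product is computed in 32 bits
 * and wraps modulo 2^32 before it is widened to the return type. */
static uint64_t total_mem_mb_wrapped(uint32_t mem_per_cpu_mb,
                                     uint32_t cpu_count)
{
    return mem_per_cpu_mb * cpu_count;
}
```

With a per-CPU value of 4194304 and 2048 CPUs, the true product is 2^33. The cast version returns it intact, while the uncast version wraps to 0.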

* Changes in Slurm 16.05.0rc1
==============================
 -- Remove the SchedulerParameters option of "assoc_limit_continue", making it
    the default value. Add option of "assoc_limit_stop". If "assoc_limit_stop"
    is set and a job cannot start due to association limits, then do not attempt
    to initiate any lower priority jobs in that partition. Setting this can
    decrease system throughput and utilization, but avoids potentially starving
    larger jobs, which could otherwise be prevented from launching indefinitely.
 -- Update a node's socket and cores per socket counts as needed after a node
    boot to reflect configuration changes which can occur on KNL processors.
    Note that the node's total core count must not change, only the distribution
    of cores across varying socket counts (KNL NUMA nodes treated as sockets by
    Slurm).
 -- Rename partition configuration from "Shared" to "OverSubscribe". Rename
    salloc, sbatch, srun option from "--shared" to "--oversubscribe". The old
    options will continue to function. Output field names also changed in
    scontrol, sinfo, squeue and sview.
 -- Add SLURM_UMASK environment variable to user job.
 -- knl_conf: Added new configuration parameter of CapmcPollFreq.
 -- squeue: remove errant spaces in column formats for "squeue -o %all".
 -- Add ARRAY_TASKS mail option to send emails to each task in a job array.
 -- Change default compression library for sbcast to lz4.
 -- select/cray - Initiate step node health check at start of step termination
    rather than after application completely ends so that NHC can capture
    information about hung (non-killable) processes.
 -- Add --units=[KMGTP] option to sacct to display values in specific unit type.
 -- Modify sacct and sacctmgr to display TRES values in converted units.
 -- Modify sacctmgr to accept TRES values with [KMGTP] suffixes.
 -- Replace hash function with more modern SipHash functions.
 -- Add "--with-cray_dir" build/configure option.
 -- BB- Only send stage_out email when stage_out is set in script.
 -- Add r/w locking to file_bcast receive functions in slurmd.
 -- Add TopologyParam option of "TopoOptional" to optimize network topology
    only for jobs requesting it.
 -- Fix build on FreeBSD.
 -- Configuration parameter "CpuFreqDef" used to set default governor for job
    step not specifying --cpu-freq (previously the parameter was unused).
 -- Fix sshare -o<format> to correctly display new lengths.
 -- Update documentation to rename Shared option to OverSubscribe.
 -- Update documentation to rename partition Priority option to PriorityTier.
 -- Prevent changing of QOS on running jobs.
 -- Update accounting when changing QOS on pending jobs.
 -- Add support to ntasks_per_socket in task/affinity.
 -- Generate init.d and systemd service scripts in etc/ through Make rather
    than at configure time to ensure all variable substitutions happen.
 -- Use TaskPluginParam for default task binding if no user specified CPU
    binding. User --cpu_bind option takes precedence over default. No longer
    any error if user --cpu_bind option does not match TaskPluginParam.
 -- Make sacct and sattach work with older slurmd versions.
 -- Fix protocol handling between 15.08 and 16.05 for 'scontrol show config'.
 -- Enable prefixes (e.g. info, debug, etc.) in slurmstepd debugging.
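
To illustrate the idea behind the [KMGTP] suffix entries above, here is a rough sketch of how a value such as "4G" might be converted to a plain count. This is not Slurm's parser: the function name parse_with_suffix is hypothetical, and the factor of 1024 per suffix step and the case-insensitive matching are assumptions.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Slurm code): parse "512", "4G", "2T", ...
 * into a plain count. Assumes each suffix step is a factor of 1024.
 * Returns 0 on success and -1 on malformed input or overflow. */
static int parse_with_suffix(const char *text, uint64_t *out)
{
    static const char suffixes[] = "KMGTP";
    char *end = NULL;
    unsigned long long value;
    uint64_t mult = 1;

    if (!text || !out || !isdigit((unsigned char)text[0]))
        return -1;

    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno == ERANGE)
        return -1;

    if (*end != '\0') {
        /* Exactly one trailing suffix character is accepted. */
        const char *pos = strchr(suffixes, toupper((unsigned char)*end));
        if (!pos || end[1] != '\0')
            return -1;
        for (long i = 0; i <= pos - suffixes; i++)
            mult *= 1024;
    }

    /* Reject values that would not fit in 64 bits after scaling. */
    if (value > UINT64_MAX / mult)
        return -1;

    *out = (uint64_t)value * mult;
    return 0;
}
```

Under these assumptions "4G" yields 4294967296 and "512" yields 512, while input such as "4X" is rejected.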
