We are pleased to announce the release of Slurm version 17.02.4, which
contains about 40 fixes developed over the past month.

Slurm can be downloaded from https://www.schedmd.com/downloads.php

* Changes in Slurm 17.02.4
==========================
-- Do not attempt to schedule jobs after changing the power cap if there are
    already many active threads.
 -- Job expansion example in FAQ enhanced to demonstrate operation in
    heterogeneous environments.
-- Prevent scontrol crash when operating on array and no-array jobs at once.
 -- knl_cray plugin: Log incomplete capmc output for a node.
-- knl_cray plugin: Change capmc parsing of mcdram_pct from string to number.
 -- Remove log files from test20.12.
 -- When rebooting a node and using the PrologFlags=alloc make sure the
    prolog is ran after the reboot.
-- node_features/knl_generic - If a node is rebooted for a pending job, but fails to enter the desired NUMA and/or MCDRAM mode then drain the node and
    requeue the job.
 -- node_features/knl_generic disable mode change unless RebootProgram
    configured.
 -- Add new burst_buffer function bb_g_job_revoke_alloc() to be executed
    if there was a failure after the initial resource allocation. Does not
    release previously allocated resources.
-- Test if the node_bitmap on a job is NULL when testing if the job's nodes
    are ready.  This will be NULL is a job was revoked while beginning.
-- Fix incorrect lock levels when testing when job will run or updating a job.
 -- Add missing locks to job_submit/pbs plugin when updating a jobs
    dependencies.
 -- Add support for lua5.3
-- Add min_memory_per_node|cpu to the job_submit/lua plugin to deal with lua not being able to deal with pn_min_memory being a uint64_t. Scripts are
    urged to change to these new variables avoid issue.  If not set the
    variables will be 'nil'.
 -- Calculate priority correctly when 'nice' is given.
 -- Fix minor typos in the documentation.
 -- node_features/knl_cray: Preserve non-KNL active features if slurmctld
    reconfigured while node boot in progress.
-- node_features/knl_generic: Do not repeatedly log errors when trying to read
    KNL modes if not KNL system.
 -- Add missing QOS read lock to backfill scheduler.
-- When doing a dlopen on liblua only attempt the version compiled against. -- Fix null-dereference in sreport cluster ulitization when configured with
    memory-leak-debug.
-- Fix Partition info in 'scontrol show node'. Previously duplicate partition
    names, or Partitions the node did not belong to could be displayed.
 -- Fix it so the backup slurmdbd will take control correctly.
-- Fix unsafe use of MAX() macro, which could result in problems cleaning up
    accounting plugins in slurmd, or repeat job cancellation attempts in
    scancel.
-- Fix 'scontrol update reservation duration=unlimited' to set the duration
    to 365-days (as is done elsewhere), rather than 49710 days.
 -- Check if variable given to scontrol show job is a valid jobid.
 -- Fix WithSubAccounts option to not include WithDeleted unless requested.
 -- Prevent a job tested on multiple partitions from being marked
    WHOLE_NODE_USER.
 -- Prevent a race between completing jobs on a user-exclusive node from
    leaving the node owned.
-- When scheduling take the nodes in completing jobs out of the mix to reduce
    fragmentation.  SchedulerParameters=reduce_completing_frag
-- For jobs submited to multiple partitions, report the job's earliest start
    time for any partition.
 -- Backfill partitions that use QOS Grp limits to "float" better.
 -- node_features/knl_cray: don't clear configured GRES from non-KNL node.
 -- sacctmgr - prevent segfault in command when a request is denied due
    to a insufficient priviledges.
 -- Add warning about libcurl-devel not being installed during configure.
 -- Streamline job purge by handling file deletion on a separate thread.
 -- Always set RLIMIT_CORE to the maximum permitted for slurmd, to ensure
    core files are created even on non-developer builds.
 -- Fix --ntasks-per-core option/environment variable parsing to set
    the requested value, instead of always setting one.
-- If trying to cancel a step that hasn't started yet for some reason return
    a good return code.
 -- Fix issue with sacctmgr show where user=''

Reply via email to