There were changes needed to support the latest libtools, but those
changes introduced a bug in setting version numbers. We'll get this
corrected in v14.03.2.
Quoting Alan Orth <alan.o...@gmail.com>:
I updated to version 14.03.1 but CLI invocation of any SLURM tool still
shows version 14.03.0:
# squeue -V
slurm 14.03.0
# rpm -qf `which squeue`
slurm-14.03.1-1.el6.x86_64
Sorry if this was covered before... thanks!
Alan
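The comparison above can be automated with a small script. This is only a sketch: it parses the two output lines exactly as they appear in this message, and the function names and regular expressions are illustrative assumptions, not part of any Slurm or RPM API.

```python
import re


def cli_version(output):
    """Extract the version from `squeue -V` style output, e.g. 'slurm 14.03.0'."""
    match = re.match(r"slurm\s+(\d+(?:\.\d+)+)", output.strip())
    return match.group(1) if match else None


def rpm_version(package):
    """Extract the version from an RPM name, e.g. 'slurm-14.03.1-1.el6.x86_64'."""
    match = re.match(r"slurm-(\d+(?:\.\d+)+)-", package.strip())
    return match.group(1) if match else None


def versions_match(cli_output, rpm_package):
    """Return True only if both strings parse and report the same version."""
    cli = cli_version(cli_output)
    rpm = rpm_version(rpm_package)
    return cli is not None and cli == rpm


if __name__ == "__main__":
    # The two lines reported in this message disagree, so this prints False.
    print(versions_match("slurm 14.03.0", "slurm-14.03.1-1.el6.x86_64"))
```

In practice the two strings would come from running `squeue -V` and `rpm -qf` on the node; they are hard-coded here so the sketch runs without Slurm installed.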
On 04/21/2014 11:20 PM, je...@schedmd.com wrote:
Slurm version 14.03.1 is now available with four weeks' worth of bug
fixes as described below. You can download Slurm from:
http://www.schedmd.com/#repos
* Changes in Slurm 14.03.1
==========================
-- Add support for job std_in, std_out and std_err fields in Perl API.
-- Add "Scheduling Configuration Guide" web page.
-- BGQ - Fix check for jobinfo when it is NULL.
-- Do not check cleaning on "pending" steps.
-- task/cgroup plugin - Fix for building on older hwloc (v1.0.2).
-- In the PMI implementation, don't check for duplicate keys by default.
Set the SLURM_PMI_KVS_DUP_KEYS environment variable if you want the code
to check for duplicate keys.
-- Add job submission time to squeue.
-- Permit user root to propagate resource limits higher than the hard
limit slurmd has on that compute node (i.e. raise both current and
maximum limits).
-- Fix issue with license used count when doing an scontrol reconfig.
-- Fix the PMI iterator to not report duplicated keys.
-- Fix issue with sinfo when -o is used without the %P option.
-- Rather than immediately invoking an execution of the scheduling
logic on every event type that can enable the execution of a new job,
queue its execution. This permits faster execution of some operations,
such as modifying large counts of jobs, by executing the scheduling
logic less frequently, but still in a timely fashion.
-- If an environment variable is longer than MAX_ENV_STRLEN, don't set
it in the job environment; otherwise the exec() fails.
-- Optimize scontrol hold/release logic for job arrays.
-- Modify srun to report an exit code of zero rather than nine if some
tasks exit with a return code of zero and others are killed with
SIGKILL. Only an exit code of zero did this.
-- Fix a typo in scontrol man page.
-- Avoid slurmctld crash getting job info if detail_ptr is NULL.
-- Fix sacctmgr add user where both defaultaccount and accounts are
specified.
-- Added SchedulerParameters option of max_sched_time to limit how long
the main scheduling loop can execute for.
-- Added SchedulerParameters option of sched_interval to control how
frequently the main scheduling loop will execute.
-- Move start time of main scheduling loop timeout after locks are
acquired.
-- Add squeue job format option of "%y" to print a job's nice value.
-- Update scontrol update jobID logic to operate on entire job arrays.
-- Fix PrologFlags=Alloc to run the prolog on each of the nodes in the
allocation instead of just the first.
-- Fix race condition if a step is starting while the slurmd is being
restarted.
-- Make sure a job's prolog has run before starting a step.
-- BGQ - Fix invalid memory read when using DefaultConnType in the
bluegene.conf.
-- Make sure we send node state to the DBD on clean start of controller.
-- Fix some sinfo and squeue sorting anomalies due to differences in
data types.
-- Only send message back to slurmctld when PrologFlags=Alloc is used on
a Cray/ALPS system; otherwise use the slurmd to wait on the prolog to
gate the start of the step.
-- Remove need to check PrologFlags=Alloc in slurmd since we can tell if
prolog has run yet or not.
-- Fix squeue to use a correct macro to check job state.
-- BGQ - Fix incorrect logic issues if MaxBlockInError=0 in the
bluegene.conf.
-- priority/basic - Ensure job priorities continue to decrease when jobs
are submitted with the --nice option.
-- Make PrologFlags=Alloc work on batch scripts.
-- Make PrologFlags=NoHold (automatically sets PrologFlags=Alloc) not
hold in salloc/srun; instead wait in the slurmd when a step hits a node
and the prolog is still running.
-- Added --cpu-freq=highm1 (high minus one) option.
-- Expand StdIn/Out/Err string length output by "scontrol show job" from
128 to 1024 bytes.
-- squeue %F format will now print the job ID for non-array jobs.
-- Use quicksort for all priority based job sorting, which improves
performance significantly with large job counts.
-- If a job has already been released from a held state ignore
successive release requests.
-- Fix srun/salloc/sbatch man pages for the --no-kill option.
-- Add squeue -L/--licenses option to filter jobs by license names.
-- Handle abort job on node on front end systems without core dumping.
-- Fix dependency support for job arrays.
-- When updating jobs verify the update request is not identical to
the current settings.
-- When sorting jobs and priorities are equal sort by job_id.
-- Do not overwrite existing reason for node being down or drained.
-- Requeue batch job if Munge is down and credential cannot be created.
-- Make _slurm_init_msg_engine() tolerate bug in bind() returning a busy
ephemeral port.
-- Don't block scheduling of entire job array if it could run in
multiple partitions.
-- Introduce a new debug flag Protocol to print protocol requests
received together with the remote IP address and port.
-- CRAY - Set up the network even when only using 1 node.
-- CRAY - Greatly reduce the number of error messages produced from the
task plugin and provide more information in the message.
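
The change in the list above that queues the scheduling logic, rather than invoking it on every event, can be pictured with a small sketch. This is not Slurm's implementation: the class and method names (DeferredScheduler, on_event, tick) are hypothetical, and the periodic tick() stands in for whatever mechanism slurmctld actually uses to run the queued pass.

```python
class DeferredScheduler:
    """Coalesce many scheduling requests into a single scheduler pass."""

    def __init__(self, run_scheduler):
        self._run_scheduler = run_scheduler  # callable doing the real work
        self._pending = False                # True once any event asks for a pass
        self.passes = 0                      # number of scheduler passes executed

    def on_event(self, event):
        # Instead of scheduling immediately, only note that a pass is needed.
        self._pending = True

    def tick(self):
        # Called periodically; runs at most one scheduler pass per call.
        if not self._pending:
            return False
        self._pending = False
        self._run_scheduler()
        self.passes += 1
        return True


if __name__ == "__main__":
    scheduler = DeferredScheduler(run_scheduler=lambda: None)
    for job_id in range(1000):  # e.g. modifying a large count of jobs
        scheduler.on_event(("job_modified", job_id))
    scheduler.tick()
    print(scheduler.passes)  # one pass for all 1000 events
```

The point of the sketch is only the trade-off the changelog entry describes: events become cheap because they set a flag, and the expensive pass runs once per tick instead of once per event.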
--
Alan Orth
alan.o...@gmail.com
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my
telephone; my wish has come true because I can no longer figure out
how to use my telephone." -Bjarne Stroustrup, inventor of C++
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0