Hi,
I would like to know more about Slurm's backfill algorithm, and how
bf_resolution and bf_window are used in it.
I have been searching the mailing list archive and saw that many people
have had issues with job start times changing or with job starvation, which
can be solved by tuning bf_w
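
For orientation, here is a rough sketch of how the two parameters relate. It
assumes the units given in the slurm.conf man page (bf_window in minutes,
default 1440; bf_resolution in seconds, default 60). The function names are
made up for illustration, and this is a toy model of the time grid, not
Slurm's actual backfill code.

```python
import math


def backfill_slots(bf_window_min: int = 1440, bf_resolution_sec: int = 60) -> int:
    """Toy model: how many time slots are needed to cover bf_window minutes
    at a granularity of bf_resolution seconds."""
    if bf_window_min <= 0 or bf_resolution_sec <= 0:
        raise ValueError("both parameters must be positive")
    return math.ceil(bf_window_min * 60 / bf_resolution_sec)


def round_up_to_resolution(seconds_from_now: int, bf_resolution_sec: int = 60) -> int:
    """Round a candidate start offset up to the next resolution boundary."""
    return math.ceil(seconds_from_now / bf_resolution_sec) * bf_resolution_sec


if __name__ == "__main__":
    # Defaults: one day of look-ahead at one-minute granularity.
    print(backfill_slots())
    # Two days of look-ahead at five-minute granularity needs fewer slots.
    print(backfill_slots(2880, 300))
```

In this simplified picture, a larger bf_window lets the scheduler plan further
ahead at the cost of more slots to track, and a coarser bf_resolution reduces
that cost while making planned start times less precise.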
Hello,
Some modifications to the slurm.conf require me to restart the slurmd
daemons on all nodes. Is there a way to do this without losing any
running jobs (and not having to drain the cluster)?
Thanks,
Robbert
--
Robbert Eggermont Intelligent Systems
r.e
You should be able to do this without losing any jobs (at least I've
never lost any on any version of Slurm I have run). I do it all the
time in our environment (about once a day) as our slurm.conf is in flux
quite a bit. It should always preserve the running and pending state.
The only i
While this is true, be very, very careful when restarting the slurmd on
the controller node.
It's quite easy to miss a typo in one of the config files, e.g. an
unexpected comma in topology.conf, which can cause Slurm to segfault or
otherwise shut down uncleanly. If this happens then the state of
I've had this happen several times, but have never lost jobs due to it.
Still one should always watch the logs on the master when restarting so
you can catch typos immediately.
We run a sanity check on our conf files before we push them (we use Puppet
for configuration control). Our post commi
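
As an illustration of the kind of pre-push check meant here, the sketch below
flags empty items in comma-separated values, such as the stray comma in
topology.conf mentioned earlier in the thread. It is not the check run at any
particular site; the function name and the sample lines are hypothetical, and
it catches only this one class of typo rather than validating the file the
way Slurm's own parser would.

```python
def find_empty_list_items(conf_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs where a key=value token has an
    empty item in its comma-separated value (e.g. 'a,,b', 'a,b,' or 'key=')."""
    problems = []
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if not line:
            continue
        for token in line.split():
            key, sep, value = token.partition("=")
            if not sep:
                continue  # not a key=value token
            # Bracketed ranges such as n[1-4,7] split into non-empty pieces,
            # so they are not reported.
            if any(item == "" for item in value.split(",")):
                problems.append((lineno, raw))
                break
    return problems


if __name__ == "__main__":
    sample = (
        "SwitchName=s0 Nodes=n[1-4,7],n9\n"
        "SwitchName=s1 Nodes=n10,,n11  # typo\n"
        "# comment, with comma,\n"
        "SwitchName=s2 Switches=s0,s1,\n"
    )
    for lineno, text in find_empty_list_items(sample):
        print(f"line {lineno}: suspicious comma: {text}")
```

A check like this could run from a commit hook so that a bad file is rejected
before it reaches the controller.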
I've only ever had this happen once, but it's Murphy's law that it didn't
happen on the test system but on the production system, and I was just a
minute or so too slow finding the error.
Antony
On 12 Oct 2015 18:25, "Paul Edmon" wrote:
>
> I've had this happen several times, but have never los
Hello,
Our GRID cluster has so far been using Slurm version 14.11.4 on an el6
system, and I wish to upgrade it to the newest version. The current cluster
includes a master node (which is also a job execution node) and three
other job execution nodes.
I performed the upgrade using (with RPM built w
Can you post:
sacctmgr show config
Do you get more info if you run slurmdbd in debug mode?
slurmdbd -D
Cheers,
Barbara
On 10/12/2015 09:21 PM, gasper.ku...@ung.si wrote:
> Hello,
> Our GRID cluster has so far been using slurm version 14.11.4 on an el6
> system and I wish to upgrade it to t