Yes, we have this set up here. Here is an example:
# Serial Requeue
PartitionName=serial_requeue Priority=1 \
PreemptMode=REQUEUE MaxTime=7-0 Default=YES MaxNodes=1 \
AllowGroups=cluster_users \
Nodes=blah
# Priority
PartitionName=priority Priority=10 \
AllowGroups=important_people \
Nodes=blah
# JOB PREEMPTION
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
Since serial_requeue has the lowest priority, it gets scheduled last,
and if any jobs come in from the higher-priority partition, Slurm
preempts and requeues the lower-priority jobs.
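
For completeness, here is a rough sketch of what a job script for the
serial_requeue partition above might look like. It is not taken from our
setup. The resource values and the fresh/resume branch are purely
illustrative assumptions. As far as I know, a job is only requeued on
preemption if it was submitted with --requeue or if JobRequeue=1 is set
in slurm.conf; otherwise it is cancelled.

```shell
#!/bin/bash
#SBATCH --partition=serial_requeue   # partition name from the config above
#SBATCH --requeue                    # let Slurm requeue this job when it is preempted
#SBATCH --ntasks=1                   # illustrative; the partition has MaxNodes=1
#SBATCH --time=1-00:00:00            # illustrative; must fit within MaxTime=7-0
#SBATCH --open-mode=append           # append to the output file after a requeue

# Slurm sets SLURM_RESTART_COUNT once a job has been requeued; treat unset as 0.
restart_count="${SLURM_RESTART_COUNT:-0}"

if [ "$restart_count" -gt 0 ]; then
    # Hypothetical: a real application would reload its own checkpoint here.
    run_mode="resume"
else
    run_mode="fresh"
fi

echo "run_mode=${run_mode} restart_count=${restart_count}"
```

Any job in this partition can be requeued at any time, so it should be
able to restart from the beginning or from its own checkpoint files.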
-Paul Edmon-
On 10/20/2014 02:03 PM, je...@schedmd.com wrote:
This should help:
http://slurm.schedmd.com/preempt.html
Quoting Mikael Johansson <mikael.johans...@iki.fi>:
Hello All,
I've been scratching my head for a while now trying to figure this
one out, which I would think would be a rather common setup.
I would need to set up a partition (or whatever, maybe a partition is
actually not the way to go) with the following properties:
1. If there are any unused cores on the cluster, jobs submitted to this
one would use them, and immediately have access to them.
2. The jobs should only use these resources until _any_ other job in
another partition needs them. In this case, the jobs should be
preempted and requeued.
So this should be some sort of "shadow" queue/partition, that
shouldn't affect the scheduling of other jobs on the cluster, but
just use up any free resources that momentarily happen to be
available. So SLURM should just continue scheduling everything else
normally, and treat the cores used by this shadow queue as free
resources, and then just immediately cancel and requeue any jobs
there, when a "real" job starts.
If anyone has something like this set up, example configs would be
very welcome, as would, of course, any other suggestions and ideas.
Cheers,
Mikael J.
http://www.iki.fi/~mpjohans/