What you are describing is definitely doable.  We have our system set up 
similarly.  All nodes are in the "open" partition and "prio" partition, but a 
job submitted to the "prio" partition will preempt the open jobs.

I don't see anything clearly wrong with your slurm.conf settings.  Ours are 
very similar, though we use only FORCE:1 for oversubscribe.  You might try that 
just to see if there's a difference.
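
As a sketch of that experiment, here are the settings from your message with only the OverSubscribe value changed to FORCE:1. Everything else is copied from your config, and I have not tested this on your setup, so treat it as a starting point rather than a known fix.

```
# slurm.conf (fragment): original settings, OverSubscribe changed to FORCE:1
PreemptMode=suspend,gang
PreemptType=preempt/partition_prio

PartitionName=normal Nodes=node0[01-08] MaxTime=1800 PriorityTier=100 AllowAccounts=group1,group2 OverSubscribe=FORCE:1 PreemptMode=suspend
PartitionName=hi-pri Nodes=node0[01-08] MaxTime=360 PriorityTier=500 AllowAccounts=group2 OverSubscribe=FORCE:1 PreemptMode=off
```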

What are the sbatch settings you are using when you submit the jobs?

Do you have PreemptExemptTime set to anything in slurm.conf?

What is the reason squeue gives for the high priority jobs to be pending?
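
To gather the answers to the last two questions, something like the following should work. This is a minimal sketch: the function names are made up for illustration, and it assumes squeue and scontrol are on your PATH.

```shell
# Print job id, partition, state, and the scheduler's reason for each pending job.
# squeue format fields: %i = job id, %P = partition, %T = state, %r = reason
pending_reasons() {
    squeue --states=PENDING --format="%.10i %.12P %.10T %r"
}

# Show the preemption-related settings the controller is actually running with
# (PreemptMode, PreemptType, PreemptExemptTime, ...).
preempt_settings() {
    scontrol show config | grep -i preempt
}
```

Running pending_reasons while a hi-pri job is stuck shows the reason column for it (for example Resources or Priority), and preempt_settings shows whether the running configuration matches what is in slurm.conf.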

For your "run regularly" goal, you might consider scrontab.  Once we figure 
out priority and preemption, scrontab will start the job at a regular time.
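
A scrontab entry is a cron-style line preceded by #SCRON lines that carry sbatch-style options. A minimal sketch, assuming the hi-pri partition and group2 account from your config. The schedule, time limit, and script path are placeholders I made up:

```
# Edit with: scrontab -e
#SCRON --partition=hi-pri
#SCRON --account=group2
#SCRON --time=60
0 6 * * * /path/to/your_job.sh
```

With this, a job is submitted to hi-pri every day at 06:00. Whether it starts immediately still depends on preemption working as intended.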

Rob

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Fabrizio 
Roccato <f.rocc...@isac.cnr.it>
Sent: Wednesday, May 24, 2023 7:17 AM
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] hi-priority partition and preemption

Hi all,
        I'm trying to set up two overlapping partitions, say normal and hi-pri,
so that when jobs are launched in the second one they can preempt the jobs
already running in the first one, automatically putting them in the suspended
state. After the hi-pri jobs complete, the jobs in the normal partition must be
automatically resumed.

Here are my (relevant) slurm.conf settings:

> PreemptMode=suspend,gang
> PreemptType=preempt/partition_prio
>
> PartitionName=normal Nodes=node0[01-08] MaxTime=1800 PriorityTier=100 
> AllowAccounts=group1,group2 OverSubscribe=FORCE:20 PreemptMode=suspend
> PartitionName=hi-pri Nodes=node0[01-08] MaxTime=360 PriorityTier=500 
> AllowAccounts=group2 OverSubscribe=FORCE:20 PreemptMode=off

However, jobs in the hi-pri partition are put in the PD state, and the ones
already running in the normal partition continue in the R state.
What am I doing wrong? What am I missing?

Since I have jobs that must run at a specific time and must have priority over
all others, is this the correct way to do it?


Thanks

FR
