On our system, for some partitions, we guarantee that a job can run for at 
least an hour before being preempted by a higher priority job.  We use the QOS 
PreemptExemptTime setting for this, and it appears to be working.  But of 
course, I want to TEST that it works.

So on a test system, I start a lower priority job on a specific node, wait 
until it starts running, and then I start a higher priority job for the same 
node.  The test should only pass if the higher priority job has an OPPORTUNITY 
to preempt the lower priority job, and doesn't.

Now, I know I can get a preempt eligible time out of scontrol for the lower 
priority job and verify that it's set for an hour (I do check that already), 
but that's not good enough for me.  I could obviously let the test run for an 
hour to verify the lower priority job was never preempted...but that's not 
really feasible.  So instead, I want to verify that the higher priority job has 
had a chance to preempt the lower priority job, and it did not.
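
For reference, the existing one-hour check can be sketched roughly as below. 
This is a minimal Python sketch, not the actual test code: it assumes the job 
record exposes StartTime and PreemptEligibleTime fields in ISO format and that 
the difference between them equals the exempt time.  The helper names and the 
SAMPLE record are made up for illustration.

```python
import subprocess
from datetime import datetime


def scontrol_show_job(job_id):
    """Return the one-line `scontrol show job` record for job_id as text."""
    result = subprocess.run(
        ["scontrol", "show", "job", "-o", str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_job_fields(text):
    """Split a job record into a dict of Key=Value fields.

    Values that contain spaces are cut at the first space, which is
    good enough for the timestamp fields used here.
    """
    fields = {}
    for token in text.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


def exempt_window_seconds(fields):
    """Seconds between job start and the time it becomes preemptable.

    Assumption: PreemptEligibleTime is StartTime plus the exempt time.
    """
    start = datetime.fromisoformat(fields["StartTime"])
    eligible = datetime.fromisoformat(fields["PreemptEligibleTime"])
    return (eligible - start).total_seconds()


# Illustrative record only (invented values, not output from a real cluster).
SAMPLE = (
    "JobId=101 JobName=low_prio JobState=RUNNING "
    "StartTime=2024-01-15T10:00:00 EndTime=2024-01-15T14:00:00 "
    "PreemptEligibleTime=2024-01-15T11:00:00 PreemptTime=None"
)

if __name__ == "__main__":
    print(exempt_window_seconds(parse_job_fields(SAMPLE)))
```

On a real system the test would pass scontrol_show_job(job_id) to 
parse_job_fields instead of SAMPLE and compare the result against 3600.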

So far, the way I've been doing that is to check the reported Scheduler in the 
scontrol job output for the higher priority job.  I figure that when the 
scheduler changes to Backfill instead of Main, then the higher priority job has 
been seen by the main scheduler and it passed on the chance to preempt the 
lower priority job.
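
Roughly, the check looks like the Python sketch below.  It only encodes the 
assumption described above (that a Scheduler field flipping from Main to 
Backfill means the main scheduler has already looked at the job), so it is 
only as good as that assumption.  The function names, timeout and polling 
interval are hypothetical; fetch is any callable that returns the scontrol 
job text for the higher priority job.

```python
import time


def scheduler_of(text):
    """Return the value of the Scheduler= field in a job record, or None."""
    for token in text.split():
        if token.startswith("Scheduler="):
            return token.split("=", 1)[1]
    return None


def wait_for_backfill(fetch, timeout=120.0, interval=2.0, sleep=time.sleep):
    """Poll fetch() until the job reports Scheduler=Backfill.

    Returns True if Backfill was seen before the timeout, else False.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if scheduler_of(fetch()) == "Backfill":
            return True
        sleep(interval)
    return False
```

Once wait_for_backfill returns True, the test then confirms that the lower 
priority job is still running and the higher priority job is still pending.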

Is that a good assumption?  Is there any other, or potentially quicker, way to 
verify that the higher priority job will NOT preempt the lower priority job?

Rob
