Hi Loris,

Slurm's main scheduling algorithm is purely priority based. It starts with
the highest-priority job and keeps working through the waiting queue as
long as jobs can be started. As soon as a job cannot be started, the
algorithm stops for that job's partition. If you have more than one
partition, the scheduler then only keeps trying jobs submitted to the
other partitions, and so on.
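
Just as an illustration (this is not the Slurm source; the Job record, the
main_scheduler_pass name and the CPU bookkeeping are invented), the main
loop behaves roughly like this:

    # Sketch of the main scheduling pass: walk the queue in priority
    # order and, per partition, stop at the first job that does not fit.
    from dataclasses import dataclass

    @dataclass
    class Job:
        name: str
        partition: str
        cpus: int

    def main_scheduler_pass(queue, idle_cpus):
        """queue: jobs sorted by descending priority.
        idle_cpus: dict mapping partition name -> free CPUs."""
        blocked = set()                     # partitions that already hit a stuck job
        started = []
        for job in queue:
            if job.partition in blocked:
                continue                    # a higher-priority job is already waiting here
            if job.cpus <= idle_cpus[job.partition]:
                idle_cpus[job.partition] -= job.cpus
                started.append(job.name)    # job can be launched now
            else:
                blocked.add(job.partition)  # stop considering this partition
        return started

So without backfill, one job at the top of the queue that does not fit is
enough to keep everything behind it waiting in that partition.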

If you have backfilling configured, jobs get a second chance to be
scheduled. This algorithm also walks the priority queue, but the
higher-priority jobs that cannot be started get a reservation for the
resources they need, and the algorithm keeps going. A lower-priority job
can then be started if resources are available and running it does not
delay the previously reserved higher-priority jobs. For this the job's
time limit is critical: a job with a short time limit can easily use the
"holes" left around those reservations.

So in your case I would say (assuming you have backfilling configured)
that the jobs from user C cannot be started because of their long time
limit. However, there are other parameters worth looking at. The backfill
algorithm is quite expensive, so it does not traverse the whole queue;
how many jobs it examines is configurable, so you may want to tune that
value for your system. Another interesting option is to assign the
no_reserve flag to a specific QOS, which means jobs in that QOS will not
get a reservation when the backfill algorithm processes them. This is an
important decision, as it can lead to lower-priority jobs overtaking
higher-priority ones, so use it with care; we use this flag on a system
with peaks of tens of thousands of jobs. Both settings are sketched
below.
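
For reference, the two settings I mean look roughly like this (the values
and the QOS name "throughput" are only examples; check the slurm.conf and
sacctmgr documentation for your Slurm version):

    # slurm.conf: enable backfill and raise the number of queued jobs the
    # backfill pass examines (bf_max_job_test); bf_window, in minutes,
    # should be at least as long as your longest time limit.
    SchedulerType=sched/backfill
    SchedulerParameters=bf_max_job_test=1000,bf_window=4320

    # The "no_reserve" behaviour is the NoReserve flag on a QOS:
    sacctmgr modify qos throughput set flags=NoReserve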

I hope this helps.

On 02/05/2014 11:42 AM, Loris Bennett wrote:
> "Loris Bennett" <loris.benn...@fu-berlin.de>
> writes:
>
>> We do already use weighting, but my understanding was that this would
>> only affect the order in which resources are assigned and not prevent a
>> job from starting even when resources are available.
>>
>> I assume that there is some valid reason for a job waiting, but it is
>> not apparent to me.  I guess it would be helpful if it were possible to
>> see exactly what resources a job is waiting for, but I haven't come
>> across a way to do that.
>>     
> The situation of jobs not starting despite resources being available
> has occurred again.
>
> - User A with the highest priority jobs has reached her running job
>   limit, so no more of her jobs can start.  Her jobs have a time limit
>   of 2 days.
>
> - User B with the next highest priority jobs needs more memory than is
>   available on the free node, so his job can't start there.  His jobs
>   have a time limit of 3 days.
>
> - User C is next in line and needs all the CPUs of the node, but very
>   little memory.  It seems that his job should start, but it doesn't.
>   His jobs have a time limit of 3 days.
>
> Should User B's job prevent User C's job from starting?  Or is it
> because User C's time limit is greater than that of User A?  I can sort
> of see why a lower priority job with a long run-time maybe shouldn't
> start before a higher priority, short run-time job which is being held
> back due to the running job limit, but is this really what is going on?
>
> Regards
>
> Loris


