[slurm-dev] AssocGrp*Limits being considered for scheduling

2016-02-23 Thread Lucas Gabriel Vuotto


Hello,

We want to know if there is a "built-in" solution for the following situation:

We have a special account A in sacctmgr which gives some users more CPU 
minutes to use each month. We also use the multifactor priority plugin to 
decide which jobs start first. Right now, some jobs from account A can't 
start because the extra resources have been consumed, so they won't start 
until March 1st. Meanwhile, other queued jobs with lower priority than the 
ones from account A are not starting either, because the scheduler still 
considers the account A jobs schedulable and assigns them a StartTime of 
today.


Basically, what we want to know is if there is some option/plugin to either:

  1. delay the StartTime of jobs that can't start because of 
AssocGrp*Limits
  2. set the priority to 0 for those jobs until next month
  3. do anything else that has the desired effect (run the jobs that can 
actually run this month, this month)


Ideally, we want to know if there is a solution within Slurm itself, 
rather than running a cron job every 10 minutes to do option 1 manually, 
which is the only idea we have right now (better ideas are welcome, though).
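
For reference, the cron-job fallback described above could look roughly like the following. This is only a sketch under several assumptions: the limits reset on the 1st of each month, GNU date is available, the script runs as a user allowed to update other users' jobs, and the pending Reason reported by squeue starts with "AssocGrp" and ends with "Limit". The function names are made up for illustration, and job arrays are not handled specially.

```shell
#!/bin/sh
# Sketch of "option 1 by hand": push back the StartTime of pending jobs
# that are blocked by an association group limit.
# Assumption: the limits reset on the 1st of each month.

# Read "JOBID REASON" lines on stdin and print the IDs of jobs whose
# reason starts with AssocGrp and ends with Limit.
assoc_limited_jobs() {
    awk '$2 ~ /^AssocGrp.*Limit$/ { print $1 }'
}

# Print midnight on the first day of next month (requires GNU date).
# Starting from the 1st of the current month avoids end-of-month overflow.
next_month_start() {
    date -d "$(date +%Y-%m-01) +1 month" +%Y-%m-%dT00:00:00
}

# Only touch the queue when the Slurm client tools are installed.
if command -v squeue >/dev/null 2>&1 && command -v scontrol >/dev/null 2>&1; then
    begin=$(next_month_start)
    # -h: no header, -t PD: pending jobs only, %i: job ID, %r: reason
    squeue -h -t PD -o "%i %r" | assoc_limited_jobs | while read -r jobid; do
        scontrol update JobId="$jobid" StartTime="$begin"
    done
fi
```

Once a job has been given a future StartTime it should no longer be reported with an association-limit reason, so repeated runs should leave it alone.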


Cheers & thanks!


-- lv.


[slurm-dev] allocating entire nodes from within an allocation

2016-02-23 Thread Craig Yoshioka

Hi,

Sorry if this has been answered before, I did not find an answer in a search.

Is there a way to have srun use entire nodes in a multi-node allocation?

If I run:

$ salloc -N 2 --exclusive

then run:

$ srun -N 1 -n 1 --exclusive sleep 10 &
$ srun -N 1 -n 1 --exclusive sleep 10 &
$ srun -N 1 -n 1 --exclusive sleep 10 &
$ srun -N 1 -n 1 --exclusive sleep 10 &
$ wait

It doesn’t seem to block like I’d expect: the total wait is 10 seconds.  I see 
that the meaning of --exclusive is different in the two environments.  Is there 
a way to get the original behavior from within an allocation?
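
To make the observation easier to reproduce, the timing can be measured with a small helper like the one below. It is only a sketch: time_parallel is a made-up name, and the script assumes it is started from the shell that salloc opens (where SLURM_JOB_ID is set). With two nodes, four steps that each held a whole node would have to run in two rounds, so roughly 20 seconds would presumably be the expected result; roughly 10 seconds means all four steps ran at once.

```shell
#!/bin/sh
# Launch $1 copies of the remaining arguments in the background, wait for
# all of them, and print the elapsed wall-clock time in seconds.
time_parallel() {
    count=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &
        i=$((i + 1))
    done
    wait
    echo $(( $(date +%s) - start ))
}

# Only launch job steps when srun exists and we are inside an allocation.
if command -v srun >/dev/null 2>&1 && [ -n "${SLURM_JOB_ID:-}" ]; then
    time_parallel 4 srun -N 1 -n 1 --exclusive sleep 10
fi
```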


Thanks,
-Craig

[slurm-dev] Re: AssocGrp*Limits being considered for scheduling

2016-02-23 Thread Ryan Cox


Coincidentally, I asked about that yesterday in a bug report: 
http://bugs.schedmd.com/show_bug.cgi?id=2465. The short answer is to use 
SchedulerParameters=assoc_limit_continue, which was introduced in 
15.08.8.  It only takes effect when the Reason for the job is something 
like Assoc*Limit.
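
In practice this means adding assoc_limit_continue to the comma-separated SchedulerParameters line in slurm.conf and having slurmctld reread its configuration (for example with "scontrol reconfigure"). One way to confirm that the running controller picked it up is sketched below; has_sched_param is a hypothetical helper name, and the parsing assumes the "Name = value" layout printed by "scontrol show config".

```shell
#!/bin/sh
# Print "yes" if the option named in $1 appears in the SchedulerParameters
# line read from stdin, "no" otherwise.
has_sched_param() {
    awk -v want="$1" '
        /^SchedulerParameters[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")      # drop "SchedulerParameters ="
            gsub(/[ \t\r]/, "")           # drop stray whitespace
            n = split($0, opts, ",")
            for (i = 1; i <= n; i++) {
                split(opts[i], kv, "=")   # options may carry a value
                if (kv[1] == want) found = 1
            }
        }
        END { print (found ? "yes" : "no") }
    '
}

# Query the running configuration only when scontrol is installed.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show config | has_sched_param assoc_limit_continue
fi
```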


Ryan

On 02/23/2016 10:58 AM, Lucas Gabriel Vuotto wrote:


Hello,

We want to know if there is a "built-in" solution for the following 
situation:

We have a special account A in sacctmgr which gives some users more 
CPU minutes to use each month. We also use the multifactor priority 
plugin to decide which jobs start first. Right now, some jobs from 
account A can't start because the extra resources have been consumed, 
so they won't start until March 1st. Meanwhile, other queued jobs with 
lower priority than the ones from account A are not starting either, 
because the scheduler still considers the account A jobs schedulable 
and assigns them a StartTime of today.


Basically, what we want to know is if there is some option/plugin to 
either:

  1. delay the StartTime of jobs that can't start because of 
AssocGrp*Limits
  2. set the priority to 0 for those jobs until next month
  3. do anything else that has the desired effect (run the jobs that 
can actually run this month, this month)


Ideally, we want to know if there is a solution within Slurm itself, 
rather than running a cron job every 10 minutes to do option 1 
manually, which is the only idea we have right now (better ideas are 
welcome, though).


Cheers & thanks!


-- lv.