Re: [slurm-users] job priority keeping resources from being used?

2019-11-05 Thread c b
On Sun, Nov 3, 2019 at 7:18 AM Juergen Salk  wrote:

>
> Hi,
>
> maybe I missed it, but what does squeue say in the reason field for
> your pending jobs that you expect to slip in?
>
>
the reason on these jobs is just "Priority".




> Is your partition maybe configured for exclusive node access, e.g. by
> setting `OverSubscribe=EXCLUSIVE´?
>
>
We don't have that setting on and i believe we are not configured for
exclusive node access otherwise.  When my small jobs that each require one
core are running, we get as many jobs as we have cores running
simultaneously on each machine.

thanks


> Best regards
> Jürgen
>
> --
> Jürgen Salk
> Scientific Software & Compute Services (SSCS)
> Kommunikations- und Informationszentrum (kiz)
> Universität Ulm
> Telefon: +49 (0)731 50-22478
> Telefax: +49 (0)731 50-22471
>
>
> * c b  [191101 14:44]:
> > I see - yes, to clarify, we are specifying memory for each of these jobs,
> > and there is enough memory on the nodes for both types of jobs to be
> > running simultaneously.
> >
> > On Fri, Nov 1, 2019 at 1:59 PM Brian Andrus  wrote:
> >
> > > I ask if you are specifying it, because if not, slurm will assume a job
> > > will use all the memory available.
> > >
> > > So without specifying, your big job gets allocated 100% of the memory
> so
> > > nothing could be sent to the node. Same if you don't specify for the
> little
> > > jobs. It would want 100%, but if anything is running there, 100% is not
> > > available as far as slurm is concerned.
> > >
> > > Brian
> > > On 11/1/2019 10:52 AM, c b wrote:
> > >
> > > yes, there is enough memory for each of these jobs, and there is enough
> > > memory to run the high resource and low resource jobs at the same time.
> > >
> > > On Fri, Nov 1, 2019 at 1:37 PM Brian Andrus 
> wrote:
> > >
> > >> Are you specifying memory for each of the jobs?
> > >>
> > >> Can't run a small job if there isn't enough memory available for it.
> > >>
> > >> Brian Andrus
> > >> On 11/1/2019 7:42 AM, c b wrote:
> > >>
> > >> I have:
> > >> SelectType=select/cons_res
> > >> SelectTypeParameters=CR_CPU_Memory
> > >>
> > >> On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn  wrote:
> > >>
> > >>> > In theory, these small jobs could slip in and run alongside the
> large
> > >>> jobs,
> > >>>
> > >>> what are your SelectType and SelectTypeParameters settings?
> > >>> ExclusiveUser=YES on partitions?
> > >>>
> > >>> regards, mark hahn.
> > >>>
> > >>>
>
>
>


Re: [slurm-users] job priority keeping resources from being used?

2019-11-03 Thread Juergen Salk


Hi,

maybe I missed it, but what does squeue say in the reason field for 
your pending jobs that you expect to slip in?

Is your partition maybe configured for exclusive node access, e.g. by
setting `OverSubscribe=EXCLUSIVE´?

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471


* c b  [191101 14:44]:
> I see - yes, to clarify, we are specifying memory for each of these jobs,
> and there is enough memory on the nodes for both types of jobs to be
> running simultaneously.
> 
> On Fri, Nov 1, 2019 at 1:59 PM Brian Andrus  wrote:
> 
> > I ask if you are specifying it, because if not, slurm will assume a job
> > will use all the memory available.
> >
> > So without specifying, your big job gets allocated 100% of the memory so
> > nothing could be sent to the node. Same if you don't specify for the little
> > jobs. It would want 100%, but if anything is running there, 100% is not
> > available as far as slurm is concerned.
> >
> > Brian
> > On 11/1/2019 10:52 AM, c b wrote:
> >
> > yes, there is enough memory for each of these jobs, and there is enough
> > memory to run the high resource and low resource jobs at the same time.
> >
> > On Fri, Nov 1, 2019 at 1:37 PM Brian Andrus  wrote:
> >
> >> Are you specifying memory for each of the jobs?
> >>
> >> Can't run a small job if there isn't enough memory available for it.
> >>
> >> Brian Andrus
> >> On 11/1/2019 7:42 AM, c b wrote:
> >>
> >> I have:
> >> SelectType=select/cons_res
> >> SelectTypeParameters=CR_CPU_Memory
> >>
> >> On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn  wrote:
> >>
> >>> > In theory, these small jobs could slip in and run alongside the large
> >>> jobs,
> >>>
> >>> what are your SelectType and SelectTypeParameters settings?
> >>> ExclusiveUser=YES on partitions?
> >>>
> >>> regards, mark hahn.
> >>>
> >>>




Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
I see - yes, to clarify, we are specifying memory for each of these jobs,
and there is enough memory on the nodes for both types of jobs to be
running simultaneously.

On Fri, Nov 1, 2019 at 1:59 PM Brian Andrus  wrote:

> I ask if you are specifying it, because if not, slurm will assume a job
> will use all the memory available.
>
> So without specifying, your big job gets allocated 100% of the memory so
> nothing could be sent to the node. Same if you don't specify for the little
> jobs. It would want 100%, but if anything is running there, 100% is not
> available as far as slurm is concerned.
>
> Brian
> On 11/1/2019 10:52 AM, c b wrote:
>
> yes, there is enough memory for each of these jobs, and there is enough
> memory to run the high resource and low resource jobs at the same time.
>
> On Fri, Nov 1, 2019 at 1:37 PM Brian Andrus  wrote:
>
>> Are you specifying memory for each of the jobs?
>>
>> Can't run a small job if there isn't enough memory available for it.
>>
>> Brian Andrus
>> On 11/1/2019 7:42 AM, c b wrote:
>>
>> I have:
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU_Memory
>>
>> On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn  wrote:
>>
>>> > In theory, these small jobs could slip in and run alongside the large
>>> jobs,
>>>
>>> what are your SelectType and SelectTypeParameters settings?
>>> ExclusiveUser=YES on partitions?
>>>
>>> regards, mark hahn.
>>>
>>>


Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread Burian, John
I don’t know from experience if Slurm behaves unexpectedly with “unlimited” 
versus some large number, like 30 days; but barring something unexpected, it 
seems like time limits shouldn’t be the problem.

John


From: slurm-users  on behalf of c b 

Reply-To: Slurm User Community List 
Date: Friday, November 1, 2019 at 11:09 AM
To: Slurm User Community List 
Subject: Re: [slurm-users] job priority keeping resources from being used?

On my low resource jobs I'm setting the time to 1 hour, and on my large ones 
I'm setting time=unlimited.

Is the unlimited part the problem?  I have that setting because in my cluster 
there are some machines that come in and out during the day via reservations, 
and I want to keep these larger jobs from running on those machines.





On Fri, Nov 1, 2019 at 10:56 AM Burian, John 
mailto:john.bur...@nationwidechildrens.org>>
 wrote:
Are you setting realistic job run times (sbatch –t )?

Slurm won’t backfill low priority jobs (with low resource requirements) in 
front of a high priority job (blocked waiting on high resource requirements) if 
it thinks the low priority jobs will delay the eventual start of the high 
priority job. If all jobs are submitted with the same job run time, then Slurm 
will never backfill, because as far as Slurm knows, the low priority jobs will 
take longer to finish than just waiting for the current running jobs to finish.

John


From: slurm-users 
mailto:slurm-users-boun...@lists.schedmd.com>>
 on behalf of c b 
mailto:breedthoughts@gmail.com>>
Reply-To: Slurm User Community List 
mailto:slurm-users@lists.schedmd.com>>
Date: Friday, November 1, 2019 at 10:30 AM
To: "slurm-users@lists.schedmd.com<mailto:slurm-users@lists.schedmd.com>" 
mailto:slurm-users@lists.schedmd.com>>
Subject: [slurm-users] job priority keeping resources from being used?

[WARNING: External Email - Use Caution]


Hi,

Apologies for the weird subject line...I don't know how else to describe what 
I'm seeing.

Suppose my cluster has machines with 8 cores each.  I have many large high 
priority jobs that each require 6 cores, so each machine in my cluster runs one 
of each of these jobs at a time.  However, I also have lots of small jobs that 
each require one core, and these jobs have low priority so in my queue they are 
behind all my large jobs.

In theory, these small jobs could slip in and run alongside the large jobs, but 
I'm not seeing that happen.  So my machines have two cores sitting idle when 
they could be doing work.  How do I configure slurm to run these jobs better?

thanks for any help.



Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread Brian Andrus
I ask if you are specifying it, because if not, slurm will assume a job 
will use all the memory available.


So without specifying, your big job gets allocated 100% of the memory so 
nothing could be sent to the node. Same if you don't specify for the 
little jobs. It would want 100%, but if anything is running there, 100% 
is not available as far as slurm is concerned.


Brian

On 11/1/2019 10:52 AM, c b wrote:
yes, there is enough memory for each of these jobs, and there is 
enough memory to run the high resource and low resource jobs at the 
same time.


On Fri, Nov 1, 2019 at 1:37 PM Brian Andrus > wrote:


Are you specifying memory for each of the jobs?

Can't run a small job if there isn't enough memory available for it.

Brian Andrus

On 11/1/2019 7:42 AM, c b wrote:

I have:
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory

On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn mailto:h...@mcmaster.ca>> wrote:

> In theory, these small jobs could slip in and run alongside
the large jobs,

what are your SelectType and SelectTypeParameters settings?
ExclusiveUser=YES on partitions?

regards, mark hahn.



Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
yes, there is enough memory for each of these jobs, and there is enough
memory to run the high resource and low resource jobs at the same time.

On Fri, Nov 1, 2019 at 1:37 PM Brian Andrus  wrote:

> Are you specifying memory for each of the jobs?
>
> Can't run a small job if there isn't enough memory available for it.
>
> Brian Andrus
> On 11/1/2019 7:42 AM, c b wrote:
>
> I have:
> SelectType=select/cons_res
> SelectTypeParameters=CR_CPU_Memory
>
> On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn  wrote:
>
>> > In theory, these small jobs could slip in and run alongside the large
>> jobs,
>>
>> what are your SelectType and SelectTypeParameters settings?
>> ExclusiveUser=YES on partitions?
>>
>> regards, mark hahn.
>>
>>


Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread Brian Andrus

Are you specifying memory for each of the jobs?

Can't run a small job if there isn't enough memory available for it.

Brian Andrus

On 11/1/2019 7:42 AM, c b wrote:

I have:
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory

On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn > wrote:


> In theory, these small jobs could slip in and run alongside the
large jobs,

what are your SelectType and SelectTypeParameters settings?
ExclusiveUser=YES on partitions?

regards, mark hahn.



Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
I tried setting a 5 minute time limit on some low resource jobs, and one
hour on high resource jobs, but my 5 minute jobs are still waiting behind
the hourlong jobs.

Can you suggest some combination of time limits that would work here?



On Fri, Nov 1, 2019 at 11:08 AM c b  wrote:

> On my low resource jobs I'm setting the time to 1 hour, and on my large
> ones I'm setting time=unlimited.
>
> Is the unlimited part the problem?  I have that setting because in my
> cluster there are some machines that come in and out during the day via
> reservations, and I want to keep these larger jobs from running on those
> machines.
>
>
>
>
>
> On Fri, Nov 1, 2019 at 10:56 AM Burian, John <
> john.bur...@nationwidechildrens.org> wrote:
>
>> Are you setting realistic job run times (sbatch –t )?
>>
>>
>>
>> Slurm won’t backfill low priority jobs (with low resource requirements)
>> in front of a high priority job (blocked waiting on high resource
>> requirements) if it thinks the low priority jobs will delay the eventual
>> start of the high priority job. If all jobs are submitted with the same job
>> run time, then Slurm will never backfill, because as far as Slurm knows,
>> the low priority jobs will take longer to finish than just waiting for the
>> current running jobs to finish.
>>
>>
>>
>> John
>>
>>
>>
>>
>>
>> *From: *slurm-users  on behalf of
>> c b 
>> *Reply-To: *Slurm User Community List 
>> *Date: *Friday, November 1, 2019 at 10:30 AM
>> *To: *"slurm-users@lists.schedmd.com" 
>> *Subject: *[slurm-users] job priority keeping resources from being used?
>>
>>
>>
>> [WARNING: External Email - Use Caution]
>>
>>
>>
>> Hi,
>>
>>
>>
>> Apologies for the weird subject line...I don't know how else to describe
>> what I'm seeing.
>>
>>
>>
>> Suppose my cluster has machines with 8 cores each.  I have many large
>> high priority jobs that each require 6 cores, so each machine in my cluster
>> runs one of each of these jobs at a time.  However, I also have lots of
>> small jobs that each require one core, and these jobs have low priority so
>> in my queue they are behind all my large jobs.
>>
>>
>>
>> In theory, these small jobs could slip in and run alongside the large
>> jobs, but I'm not seeing that happen.  So my machines have two cores
>> sitting idle when they could be doing work.  How do I configure slurm to
>> run these jobs better?
>>
>>
>>
>> thanks for any help.
>>
>>
>>
>


Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
On my low resource jobs I'm setting the time to 1 hour, and on my large
ones I'm setting time=unlimited.

Is the unlimited part the problem?  I have that setting because in my
cluster there are some machines that come in and out during the day via
reservations, and I want to keep these larger jobs from running on those
machines.





On Fri, Nov 1, 2019 at 10:56 AM Burian, John <
john.bur...@nationwidechildrens.org> wrote:

> Are you setting realistic job run times (sbatch –t )?
>
>
>
> Slurm won’t backfill low priority jobs (with low resource requirements) in
> front of a high priority job (blocked waiting on high resource
> requirements) if it thinks the low priority jobs will delay the eventual
> start of the high priority job. If all jobs are submitted with the same job
> run time, then Slurm will never backfill, because as far as Slurm knows,
> the low priority jobs will take longer to finish than just waiting for the
> current running jobs to finish.
>
>
>
> John
>
>
>
>
>
> *From: *slurm-users  on behalf of
> c b 
> *Reply-To: *Slurm User Community List 
> *Date: *Friday, November 1, 2019 at 10:30 AM
> *To: *"slurm-users@lists.schedmd.com" 
> *Subject: *[slurm-users] job priority keeping resources from being used?
>
>
>
> [WARNING: External Email - Use Caution]
>
>
>
> Hi,
>
>
>
> Apologies for the weird subject line...I don't know how else to describe
> what I'm seeing.
>
>
>
> Suppose my cluster has machines with 8 cores each.  I have many large high
> priority jobs that each require 6 cores, so each machine in my cluster runs
> one of each of these jobs at a time.  However, I also have lots of small
> jobs that each require one core, and these jobs have low priority so in my
> queue they are behind all my large jobs.
>
>
>
> In theory, these small jobs could slip in and run alongside the large
> jobs, but I'm not seeing that happen.  So my machines have two cores
> sitting idle when they could be doing work.  How do I configure slurm to
> run these jobs better?
>
>
>
> thanks for any help.
>
>
>


Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread Burian, John
Are you setting realistic job run times (sbatch –t )?

Slurm won’t backfill low priority jobs (with low resource requirements) in 
front of a high priority job (blocked waiting on high resource requirements) if 
it thinks the low priority jobs will delay the eventual start of the high 
priority job. If all jobs are submitted with the same job run time, then Slurm 
will never backfill, because as far as Slurm knows, the low priority jobs will 
take longer to finish than just waiting for the current running jobs to finish.

John


From: slurm-users  on behalf of c b 

Reply-To: Slurm User Community List 
Date: Friday, November 1, 2019 at 10:30 AM
To: "slurm-users@lists.schedmd.com" 
Subject: [slurm-users] job priority keeping resources from being used?

[WARNING: External Email - Use Caution]


Hi,

Apologies for the weird subject line...I don't know how else to describe what 
I'm seeing.

Suppose my cluster has machines with 8 cores each.  I have many large high 
priority jobs that each require 6 cores, so each machine in my cluster runs one 
of each of these jobs at a time.  However, I also have lots of small jobs that 
each require one core, and these jobs have low priority so in my queue they are 
behind all my large jobs.

In theory, these small jobs could slip in and run alongside the large jobs, but 
I'm not seeing that happen.  So my machines have two cores sitting idle when 
they could be doing work.  How do I configure slurm to run these jobs better?

thanks for any help.



Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
I have:
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory

On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn  wrote:

> > In theory, these small jobs could slip in and run alongside the large
> jobs,
>
> what are your SelectType and SelectTypeParameters settings?
> ExclusiveUser=YES on partitions?
>
> regards, mark hahn.
>
>


[slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
Hi,

Apologies for the weird subject line...I don't know how else to describe
what I'm seeing.

Suppose my cluster has machines with 8 cores each.  I have many large high
priority jobs that each require 6 cores, so each machine in my cluster runs
one of each of these jobs at a time.  However, I also have lots of small
jobs that each require one core, and these jobs have low priority so in my
queue they are behind all my large jobs.

In theory, these small jobs could slip in and run alongside the large jobs,
but I'm not seeing that happen.  So my machines have two cores sitting idle
when they could be doing work.  How do I configure slurm to run these jobs
better?

thanks for any help.