Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-22 Thread Chris Samuel
On Monday, 16 September 2019 2:58:54 PM PDT Brian Andrus wrote:

> JobAcctGatherType   = jobacct_gather/linux

I know that's the general advice from SchedMD, but I've always found its 
numbers to be unreliable and I'd suggest it's worth trying

JobAcctGatherType=jobacct_gather/cgroup 

as an experiment instead.
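
Before switching plugins it can help to confirm which value is actually in effect. The following is a minimal sketch, not part of the original message, that pulls one option out of text in the `Key = value` form quoted above. The helper name `read_slurm_option` and the sample text are hypothetical, and the parsing is deliberately simplified: it ignores Include directives and lines that carry several key=value pairs, and it returns the last match it sees.

```python
def read_slurm_option(conf_text, key):
    """Return the last value assigned to `key` in slurm.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, _, val = line.partition("=")
        # Compare case-insensitively; later assignments overwrite earlier ones.
        if name.strip().lower() == key.lower():
            value = val.strip()
    return value


# Hypothetical excerpt, using the setting quoted in this thread.
sample = """
# accounting settings
JobAcctGatherType   = jobacct_gather/linux
JobAcctGatherFrequency=30
"""

plugin = read_slurm_option(sample, "JobAcctGatherType")
print("JobAcctGatherType:", plugin)
```
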

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA






Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-16 Thread Brian Andrus
One other thing I noticed is that the contents of the *_job_table has
entries in tres_alloc and tres_req that seem to match types in the
tres_table, but there are no mem entries.
For example, tres_table=
+---------------+---------+------+----------------+------+
| creation_time | deleted | id   | type           | name |
+---------------+---------+------+----------------+------+
|    1559250721 |       0 |    1 | cpu            |      |
|    1559250721 |       0 |    2 | mem            |      |
|    1559250721 |       0 |    3 | energy         |      |
|    1559250721 |       0 |    4 | node           |      |
|    1559250721 |       0 |    5 | billing        |      |
|    1559250721 |       0 |    6 | fs             | disk |
|    1559250721 |       0 |    7 | vmem           |      |
|    1559250721 |       0 |    8 | pages          |      |
|    1559250721 |       1 | 1000 | dynamic_offset |      |
+---------------+---------+------+----------------+------+

But none of the jobs populate a value for 2 (mem):
+--------+-------------+------------------------------------+
| id_job | tres_req    | tres_alloc                         |
+--------+-------------+------------------------------------+
|  19779 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19780 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19781 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19782 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19783 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19784 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19785 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19786 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19787 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19788 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
+--------+-------------+------------------------------------+
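
To make the id=count strings easier to read, here is a small sketch that decodes them using the ids from the tres_table listing above. The function name `decode_tres` is hypothetical. The energy value 18446744073709551614 equals 2^64 - 2, which looks like a "no value" sentinel; treating it that way is an assumption, not something confirmed in this thread.

```python
# TRES ids copied from the tres_table listing in this message.
TRES_TYPES = {
    1: "cpu", 2: "mem", 3: "energy", 4: "node",
    5: "billing", 6: "fs/disk", 7: "vmem", 8: "pages",
}

# Assumption: 2**64 - 2 is a "no value" sentinel rather than a real reading.
NO_VALUE_64 = 2**64 - 2


def decode_tres(tres):
    """Turn 'id=count,id=count' into {type_name: count or None}."""
    decoded = {}
    for item in tres.split(","):
        if not item:
            continue
        tres_id, _, count = item.partition("=")
        name = TRES_TYPES.get(int(tres_id), "id" + tres_id)
        number = int(count)
        decoded[name] = None if number == NO_VALUE_64 else number
    return decoded


# Two tres_alloc values taken from the job table above.
rows = {
    19779: "1=4,4=1,5=4",
    19781: "1=4,3=18446744073709551614,4=1,5=4",
}
for job_id, alloc in rows.items():
    decoded = decode_tres(alloc)
    print(job_id, decoded, "mem recorded:", "mem" in decoded)
```
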

Brian Andrus


On Mon, Sep 16, 2019 at 2:58 PM Brian Andrus  wrote:

> I have
> JobAcctGatherType   = jobacct_gather/linux
>
> Brian
>
> On Mon, Sep 16, 2019 at 12:40 PM Antony Cleave 
> wrote:
>
>> Just a quick thought.
>>
>> What is your slurm.conf setting for this?
>>
>> *JobAcctGatherType* is operating system dependent and controls what
>> mechanism is used to collect accounting information. Supported values are
>> *jobacct_gather/linux* (recommended), *jobacct_gather/cgroup* and
>> *jobacct_gather/none* (no information collected).
>>
>> Antony
>>
>>
>> On Mon, 16 Sep 2019, 14:07 Brian Andrus,  wrote:
>>
>>> Yep, the maxrss field is always blank.
>>>
>>> I just checked on a different cluster and have the same result. Jobs
>>> that completed last week even have nothing in that field.
>>>
>>> So how can I troubleshoot this? Is there a way to log the sql queries
>>> made by slurmdbd?
>>>
>>> Brian
>>>
>>> On 9/15/2019 10:29 PM, Christopher Samuel wrote:
>>> > On 9/15/19 4:17 PM, Brian Andrus wrote:
>>> >
>>> >> Are steps required to capture Max RSS?
>>> >
>>> > No, you should see a MaxRSS reported for the batch step, for instance:
>>> >
>>> > $ sacct -j $JOBID -o jobid,jobname,maxrss
>>> >
>>> > All the best,
>>> > Chris
>>>
>>>


Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-16 Thread Brian Andrus
I have
JobAcctGatherType   = jobacct_gather/linux

Brian

On Mon, Sep 16, 2019 at 12:40 PM Antony Cleave 
wrote:

> Just a quick thought.
>
> What is your slurm.conf setting for this?
>
> *JobAcctGatherType* is operating system dependent and controls what
> mechanism is used to collect accounting information. Supported values are
> *jobacct_gather/linux* (recommended), *jobacct_gather/cgroup* and
> *jobacct_gather/none* (no information collected).
>
> Antony
>
>
> On Mon, 16 Sep 2019, 14:07 Brian Andrus,  wrote:
>
>> Yep, the maxrss field is always blank.
>>
>> I just checked on a different cluster and have the same result. Jobs
>> that completed last week even have nothing in that field.
>>
>> So how can I troubleshoot this? Is there a way to log the sql queries
>> made by slurmdbd?
>>
>> Brian
>>
>> On 9/15/2019 10:29 PM, Christopher Samuel wrote:
>> > On 9/15/19 4:17 PM, Brian Andrus wrote:
>> >
>> >> Are steps required to capture Max RSS?
>> >
>> > No, you should see a MaxRSS reported for the batch step, for instance:
>> >
>> > $ sacct -j $JOBID -o jobid,jobname,maxrss
>> >
>> > All the best,
>> > Chris
>>
>>


Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-16 Thread Antony Cleave
Just a quick thought.

What is your slurm.conf setting for this?

*JobAcctGatherType* is operating system dependent and controls what
mechanism is used to collect accounting information. Supported values are
*jobacct_gather/linux* (recommended), *jobacct_gather/cgroup* and
*jobacct_gather/none* (no information collected).

Antony


On Mon, 16 Sep 2019, 14:07 Brian Andrus,  wrote:

> Yep, the maxrss field is always blank.
>
> I just checked on a different cluster and have the same result. Jobs
> that completed last week even have nothing in that field.
>
> So how can I troubleshoot this? Is there a way to log the sql queries
> made by slurmdbd?
>
> Brian
>
> On 9/15/2019 10:29 PM, Christopher Samuel wrote:
> > On 9/15/19 4:17 PM, Brian Andrus wrote:
> >
> >> Are steps required to capture Max RSS?
> >
> > No, you should see a MaxRSS reported for the batch step, for instance:
> >
> > $ sacct -j $JOBID -o jobid,jobname,maxrss
> >
> > All the best,
> > Chris
>
>


Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-16 Thread Brian Andrus

Yep, the maxrss field is always blank.

I just checked on a different cluster and have the same result. Jobs 
that completed last week even have nothing in that field.


So how can I troubleshoot this? Is there a way to log the sql queries 
made by slurmdbd?


Brian

On 9/15/2019 10:29 PM, Christopher Samuel wrote:
> On 9/15/19 4:17 PM, Brian Andrus wrote:
>
>> Are steps required to capture Max RSS?
>
> No, you should see a MaxRSS reported for the batch step, for instance:
>
> $ sacct -j $JOBID -o jobid,jobname,maxrss
>
> All the best,
> Chris




Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-15 Thread Christopher Samuel

On 9/15/19 4:17 PM, Brian Andrus wrote:


> Are steps required to capture Max RSS?


No, you should see a MaxRSS reported for the batch step, for instance:

$ sacct -j $JOBID -o jobid,jobname,maxrss
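
As a rough sketch of how that check could be scripted, the snippet below scans pipe-delimited sacct output (as produced when `-P -n`, i.e. `--parsable2 --noheader`, is added to the command above) and reports step rows with an empty MaxRSS, plus jobs that have no step rows at all. The function name `check_maxrss` and the sample rows are hypothetical and made up for illustration; they are not output from the cluster discussed here.

```python
def check_maxrss(sacct_text):
    """Return (steps with blank MaxRSS, jobs that have no step rows).

    Expects lines of the form 'JobID|JobName|MaxRSS'.
    """
    jobs = []
    jobs_with_steps = set()
    blank_steps = []
    for line in sacct_text.splitlines():
        if not line.strip():
            continue
        # Split from both ends so a '|' inside the job name does no harm.
        job_id, rest = line.split("|", 1)
        _job_name, maxrss = rest.rsplit("|", 1)
        if "." in job_id:
            # Step rows look like '<jobid>.batch' or '<jobid>.0'.
            jobs_with_steps.add(job_id.split(".", 1)[0])
            if not maxrss.strip():
                blank_steps.append(job_id)
        else:
            jobs.append(job_id)
    no_steps = [j for j in jobs if j not in jobs_with_steps]
    return blank_steps, no_steps


# Hypothetical sample rows.
sample = """\
19779|test|
19780|test|
19780.batch|batch|
19781|test|
19781.batch|batch|1520K
"""

blank_steps, no_steps = check_maxrss(sample)
print("steps with blank MaxRSS:", blank_steps)
print("jobs without step rows:", no_steps)
```

The allocation row itself normally shows no MaxRSS, so only step rows are checked for a value.
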

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-15 Thread Brian Andrus

The jobs have definitely completed when I try to gather the info.

Brian

On 9/15/2019 4:01 PM, Steven Dick wrote:
> I don't think it shows up until the job completes.
>
> On Sat, Sep 14, 2019 at 2:25 AM Brian Andrus  wrote:
>> Quick question?
>> When I use sacct to show job stats, it always has a blank entry for the
>> MaxRSS field. Is there something that needs enabled to get that in?
>> I do see it if I use sstat while the job is running.
>>
>> Brian Andrus






Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-15 Thread Brian Andrus

Hmm. We are only using allocations and have slurm.conf configured with:

AccountingStorageEnforce=associations,nosteps

Are steps required to capture Max RSS?

Brian

On 9/15/2019 1:48 PM, Mark Hahn wrote:
>> When I use sacct to show job stats, it always has a blank entry for
>> the MaxRSS field. Is there something that needs enabled to get that in?
>
> missing for steps as well or only when using --allocations?
>
> regards, mark hahn.





Re: [slurm-users] MaxRSS not showing up in sacct

2019-09-15 Thread Mark Hahn
> When I use sacct to show job stats, it always has a blank entry for the
> MaxRSS field. Is there something that needs enabled to get that in?


missing for steps as well or only when using --allocations?

regards, mark hahn.



[slurm-users] MaxRSS not showing up in sacct

2019-09-14 Thread Brian Andrus

Quick question?
When I use sacct to show job stats, it always has a blank entry for the 
MaxRSS field. Is there something that needs enabled to get that in?

I do see it if I use sstat while the job is running.

Brian Andrus