Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Marcus Wagner



On 15.12.2022 at 08:23, Bjørn-Helge Mevik wrote:

Marcus Wagner  writes:


it is important to know that the JSON output seems to be broken.

First of all, unlike the normal output, it does not obey the truncate option -T.
But more importantly, I saw a job for which a "day output" (-S <start> -E <end>)
recorded no steps.
Using sacct -j <jobid> --json instead showed that job WITH steps.


It is hard to call it "broken" when it is documented behaviour:

  --json    Dump job information as JSON. All other formatting arguments will be ignored



That depends on what is meant by a formatting argument.
To me, formatting arguments are "-b", "-l", "-o", "-p|-P".
Instead, I can use filtering arguments with --json, like "-u" etc. And I
would assume that -S, -E and -T are filtering options, not formatting options.

But as I explained before, not obeying -T is merely bad behaviour; that alone is
not something I would call "broken".


But sometimes getting no steps for a job (in a larger JSON output with many
jobs) and then getting the steps when asking specifically for that jobid: that
is something I would call broken.
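
For illustration, the discrepancy can be checked with jq; this is only a sketch,
assuming the recent sacct --json layout with a top-level "jobs" array (the dates
and <jobid> are placeholders):

$ # steps per job in a day-range query
$ sacct -S 2022-12-14 -E 2022-12-15 --json | \
    jq '.jobs[] | {job_id, steps: (.steps | length)}'
$ # the same job queried directly; compare the step counts
$ sacct -j <jobid> --json | jq '.jobs[] | {job_id, steps: (.steps | length)}'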

Best Marcus

--
Dipl.-Inf. Marcus Wagner

IT Center
Group: Server, Storage, HPC
Department: Systems and Operations
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de

Social media channels of the IT Center:
https://blog.rwth-aachen.de/itc/
https://www.facebook.com/itcenterrwth
https://www.linkedin.com/company/itcenterrwth
https://twitter.com/ITCenterRWTH
https://www.youtube.com/channel/UCKKDJJukeRwO0LP-ac8x8rQ




Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Bjørn-Helge Mevik
Marcus Wagner  writes:

> it is important to know that the JSON output seems to be broken.
>
> First of all, unlike the normal output, it does not obey the truncate
> option -T.
> But more importantly, I saw a job for which a "day output" (-S <start> -E <end>)
> recorded no steps.
> Using sacct -j <jobid> --json instead showed that job WITH steps.

It is hard to call it "broken" when it is documented behaviour:

 --json    Dump job information as JSON. All other formatting arguments will be ignored

-- 
Cheers,
Bjørn-Helge




Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Marcus Wagner

Hi Bjørn-Helge,

it is important to know that the JSON output seems to be broken.

First of all, unlike the normal output, it does not obey the truncate option -T.
But more importantly, I saw a job for which a "day output" (-S <start> -E <end>)
recorded no steps.
Using sacct -j <jobid> --json instead showed that job WITH steps.

Best
Marcus

On 14.12.2022 at 08:19, Bjørn-Helge Mevik wrote:

Chandler Sobel-Sorenson  writes:


Perhaps there is a way to import it into a spreadsheet?


You can use `sacct -P -l`, which gives you '|'-separated output that
should be possible to import into a spreadsheet.

(Personally I only use `-l` when I'm looking for the name of an
attribute and am too lazy to read the man page.  Then I use -o to specify
what I want returned.)

Also, in newer versions at least, there are --json and --yaml options to give
you output that you can parse with other tools (or read, if you really want :).
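
A minimal sketch of the spreadsheet route; the field list is illustrative only:

$ # pipe-separated output with selected fields, ready for spreadsheet import
$ sacct -P -o JobID,JobName,Partition,State,Elapsed,MaxRSS > jobs.psv
$ # or turn it into a plain CSV (fine as long as no field contains a comma)
$ sacct -P -o JobID,JobName,Partition,State,Elapsed,MaxRSS | tr '|' ',' > jobs.csv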



--
Dipl.-Inf. Marcus Wagner

IT Center
Group: Server, Storage, HPC
Department: Systems and Operations
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de

Social media channels of the IT Center:
https://blog.rwth-aachen.de/itc/
https://www.facebook.com/itcenterrwth
https://www.linkedin.com/company/itcenterrwth
https://twitter.com/ITCenterRWTH
https://www.youtube.com/channel/UCKKDJJukeRwO0LP-ac8x8rQ




Re: [slurm-users] CPUSpecList confusion

2022-12-14 Thread Marcus Wagner

Hi Paul,

Since Slurm uses hwloc, I looked into these tools a little more deeply.
Using your script, I saw e.g. the following output on one node:

=== 31495434
CPU_IDs=21-23,25
21-23,25
=== 31495433
CPU_IDs=16-18,20
10-11,15,17
=== 31487399
CPU_IDs=15
9

That does not match your schemes and at first sight seems rather random.

It seems Slurm uses hwloc's logical indices, whereas cgroups uses the
OS/physical indices.
For comparison with the example above, here is an excerpt of the full output of hwloc-ls:

  NUMANode L#1 (P#1 47GB)
    L2 L#12 (1024KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 (P#3)
    L2 L#13 (1024KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 (P#4)
    L2 L#14 (1024KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 (P#5)
    L2 L#15 (1024KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 (P#9)
    L2 L#16 (1024KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 (P#10)
    L2 L#17 (1024KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 (P#11)
    L2 L#18 (1024KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 (P#15)
    L2 L#19 (1024KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 (P#16)
    L2 L#20 (1024KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20 + PU L#20 (P#17)
    L2 L#21 (1024KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21 + PU L#21 (P#21)
    L2 L#22 (1024KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22 + PU L#22 (P#22)
    L2 L#23 (1024KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23 + PU L#23 (P#23)


That does seem to match: e.g. logical PU 15 is physical PU 9 (job 31487399), and
logical PUs 16-18,20 map to physical 10-11,15,17 (job 31495433).

In short, to get the mapping one can use
$> hwloc-ls --only pu
...
PU L#10 (P#19)
PU L#11 (P#20)
PU L#12 (P#3)
PU L#13 (P#4)
PU L#14 (P#5)
PU L#15 (P#9)
PU L#16 (P#10)
PU L#17 (P#11)
PU L#18 (P#15)
PU L#19 (P#16)
PU L#20 (P#17)
PU L#21 (P#21)
PU L#22 (P#22)
PU L#23 (P#23)
...
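
If hwloc-calc is available, the same translation can also be done directly on a
range of logical PU indices; a sketch only, using the jobs from the example above:

$> hwloc-calc --logical-input --physical-output --intersect pu pu:21-23 pu:25
$> hwloc-calc --logical-input --physical-output --intersect pu pu:16-18 pu:20

Each command prints the OS/physical indices of the given logical PUs as a
comma-separated list, which should match what cgroups reports for the job.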


Best
Marcus

On 14.12.2022 at 18:11, Paul Raines wrote:

Ugh.  Guess I cannot count.  The mapping on that last node DOES work with the
"alternating" scheme, where we have (Slurm CPU_ID on the left, cgroup/OS CPU on
the right):

  0  0
  1  2
  2  4
  3  6
  4  8
  5 10
  6 12
  7 14
  8 16
  9 18
10 20
11 22
12  1
13  3
14  5
15  7
16  9
17 11
18 13
19 15
20 17
21 19
22 21
23 23

so CPU_IDs=8-11,20-23 does correspond to cgroup 16-23

Using the script

cd /sys/fs/cgroup/cpuset/slurm
for d in $(find -name 'job*') ; do
   j=$(echo $d | cut -d_ -f3)
   echo === $j
   scontrol -d show job $j | grep CPU_ID | cut -d' ' -f7
   cat $d/cpuset.effective_cpus
done

=== 1967214
CPU_IDs=8-11,20-23
16-23
=== 1960208
CPU_IDs=12-19
1,3,5,7,9,11,13,15
=== 1966815
CPU_IDs=0
0
=== 1966821
CPU_IDs=6
12
=== 1966818
CPU_IDs=3
6
=== 1966816
CPU_IDs=1
2
=== 1966822
CPU_IDs=7
14
=== 1966820
CPU_IDs=5
10
=== 1966819
CPU_IDs=4
8
=== 1966817
CPU_IDs=2
4

On all my nodes I see just two schemes: the alternating odd/even one above, and
one that does not alternate, like on this box with

CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=1

=== 1966495
CPU_IDs=0-2
0-2
=== 1966498
CPU_IDs=10-12
10-12
=== 1966502
CPU_IDs=26-28
26-28
=== 1960064
CPU_IDs=7-9,13-25
7-9,13-25
=== 1954480
CPU_IDs=3-6
3-6


On Wed, 14 Dec 2022 9:42am, Paul Raines wrote:



Yes, I see that on some of my other machines too.  So apicid is definitely not 
what SLURM is using but somehow just lines up that way on this one machine I 
have.

I think the issue is that cgroups counts, starting at 0, all the cores on the
first socket, then all the cores on the second socket.  But SLURM (on a two
socket box) counts 0 as the first core on the first socket, 1 as the first core
on the second socket, 2 as the second core on the first socket,
3 as the second core on the second socket, and so on. (Looks like I am
wrong: see below)

Why slurm does this instead of just using the assignments cgroups uses
I have no idea.  Hopefully one of the SLURM developers reads this
and can explain

Looking at another SLURM node I have (where cgroups v1 is still in use
and HT turned off) with definition

CPUs=24 Boards=1 SocketsPerBoard=2 CoresPerSocket=12 ThreadsPerCore=1

I find

[root@r440-17 ~]# egrep '^(apicid|proc)' /proc/cpuinfo  | tail -4
processor   : 22
apicid  : 22
processor   : 23
apicid  : 54

So apicid's are NOT going to work

# scontrol -d show job 1966817 | grep CPU_ID
    Nodes=r17 CPU_IDs=2 Mem=16384 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_3776056/job_1966817/cpuset.cpus
4

If Slurm's '2' meant the second core on the first socket, it should be '1' in
cgroups, but as we see above it is 4, which is the fifth core on the
first socket.  So I guess I was wrong above.

But in /proc/cpuinfo the apicid for processor 4 is 2!!!  So is apicid
right after all?  Nope, on the same machine I have

# scontrol -d show job 1960208 | grep CPU_ID
    Nodes=r17 CPU_IDs=12-19 Mem=51200 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_5164679/job_1960208/cpuset.cpus
1,3,5,7,9,11,13,15

and in /proc/cpuinfo the apicid for processor 12 is 16

Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Paul Edmon

The seff utility (in slurm-contribs) also gives good summary info.

You can also use --parsable to make things more manageable.
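
For instance (a sketch; <jobid> is a placeholder):

$ seff <jobid>
$ sacct -j <jobid> --parsable2 -o JobID,State,Elapsed,TotalCPU,MaxRSS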

-Paul Edmon-

On 12/14/22 3:41 PM, Ross Dickson wrote:
I wrote a simple Python script to transpose the output of sacct from a 
row into a column.  See if it meets your needs.


https://github.com/ComputeCanada/slurm_utils/blob/master/sacct-all.py

- Ross Dickson
Dalhousie University  /  ACENET  /  Digital Research Alliance of Canada


On Wed, Dec 14, 2022 at 1:16 PM Davide DelVento 
 wrote:


It would be very useful if there were a way (perhaps a custom script
parsing the sacct output) to provide the information in the same
format as "scontrol show job"

Has anybody attempted to do that?


Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Ross Dickson
I wrote a simple Python script to transpose the output of sacct from a row
into a column.  See if it meets your needs.

 https://github.com/ComputeCanada/slurm_utils/blob/master/sacct-all.py

- Ross Dickson
Dalhousie University  /  ACENET  /  Digital Research Alliance of Canada


On Wed, Dec 14, 2022 at 1:16 PM Davide DelVento 
wrote:

> It would be very useful if there were a way (perhaps a custom script
> parsing the sacct output) to provide the information in the same
> format as "scontrol show job"
>
> Has anybody attempted to do that?
>
>


Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Davide DelVento
It would be very useful if there were a way (perhaps a custom script
parsing the sacct output) to provide the information in the same
format as "scontrol show job"

Has anybody attempted to do that?
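
A minimal sketch of such a script; the field list and the Key=Value layout are
assumptions, not a faithful reproduction of the scontrol show job format:

$ sacct -j <jobid> -P -o JobID,JobName,User,State,Elapsed,Timelimit,NNodes,NodeList | \
    awk -F'|' 'NR==1 {split($0, h); next} {for (i = 1; i <= NF; i++) printf "%s=%s ", h[i], $i; print ""}'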


On Wed, Dec 14, 2022 at 1:25 AM Will Furnass  wrote:
>
> If you pipe output into 'less -S' then you get horizontal scrolling.
>
> Will
>
> On Wed, 14 Dec 2022, 07:03 Chandler Sobel-Sorenson, 
>  wrote:
>>
>> Is there a recommended way to read output from `sacct` involving `-l` or 
>> `--long` option?  I have dual monitors and shrunk the terminal's font down 
>> to 6 pt or so until I could barely read it, giving me 675 columns.  This was 
>> still not enough...
>>
>> Perhaps there is a way of displaying it so the lines don't wrap and I can 
>> use left/right arrow keys to scroll the output, much like `systemctl` and 
>> `journalctl` can do?
>>
>> Perhaps there is a way to import it into a spreadsheet?
>>
>> This was with version 19.05 at least.  Apologies if the output has changed 
>> in newer versions...
>>
>> Thanks
>>
>>



Re: [slurm-users] CPUSpecList confusion

2022-12-14 Thread Paul Raines
Ugh.  Guess I cannot count.  The mapping on that last node DOES work with
the "alternating" scheme, where we have (Slurm CPU_ID on the left, cgroup/OS
CPU on the right):


 0  0
 1  2
 2  4
 3  6
 4  8
 5 10
 6 12
 7 14
 8 16
 9 18
10 20
11 22
12  1
13  3
14  5
15  7
16  9
17 11
18 13
19 15
20 17
21 19
22 21
23 23

so CPU_IDs=8-11,20-23 does correspond to cgroup 16-23

Using the script

cd /sys/fs/cgroup/cpuset/slurm
for d in $(find -name 'job*') ; do
  j=$(echo $d | cut -d_ -f3)
  echo === $j
  scontrol -d show job $j | grep CPU_ID | cut -d' ' -f7
  cat $d/cpuset.effective_cpus
done

=== 1967214
CPU_IDs=8-11,20-23
16-23
=== 1960208
CPU_IDs=12-19
1,3,5,7,9,11,13,15
=== 1966815
CPU_IDs=0
0
=== 1966821
CPU_IDs=6
12
=== 1966818
CPU_IDs=3
6
=== 1966816
CPU_IDs=1
2
=== 1966822
CPU_IDs=7
14
=== 1966820
CPU_IDs=5
10
=== 1966819
CPU_IDs=4
8
=== 1966817
CPU_IDs=2
4

On all my nodes I see just two schemes: the alternating odd/even one
above, and one that does not alternate, like on this box with


CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=1

=== 1966495
CPU_IDs=0-2
0-2
=== 1966498
CPU_IDs=10-12
10-12
=== 1966502
CPU_IDs=26-28
26-28
=== 1960064
CPU_IDs=7-9,13-25
7-9,13-25
=== 1954480
CPU_IDs=3-6
3-6


On Wed, 14 Dec 2022 9:42am, Paul Raines wrote:



Yes, I see that on some of my other machines too.  So apicid is definitely 
not what SLURM is using but somehow just lines up that way on this one 
machine I have.


I think the issue is that cgroups counts, starting at 0, all the cores on the
first socket, then all the cores on the second socket.  But SLURM (on a two
socket box) counts 0 as the first core on the first socket, 1 as the first core
on the second socket, 2 as the second core on the first socket, 3 as the second
core on the second socket, and so on. (Looks like I am wrong: see below)

Why slurm does this instead of just using the assignments cgroups uses
I have no idea.  Hopefully one of the SLURM developers reads this
and can explain

Looking at another SLURM node I have (where cgroups v1 is still in use
and HT turned off) with definition

CPUs=24 Boards=1 SocketsPerBoard=2 CoresPerSocket=12 ThreadsPerCore=1

I find

[root@r440-17 ~]# egrep '^(apicid|proc)' /proc/cpuinfo  | tail -4
processor   : 22
apicid  : 22
processor   : 23
apicid  : 54

So apicid's are NOT going to work

# scontrol -d show job 1966817 | grep CPU_ID
Nodes=r17 CPU_IDs=2 Mem=16384 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_3776056/job_1966817/cpuset.cpus
4

If Slurm's '2' meant the second core on the first socket, it should be '1' in
cgroups, but as we see above it is 4, which is the fifth core on the first
socket.  So I guess I was wrong above.


But in /proc/cpuinfo the apicid for processor 4 is 2!!!  So is apicid
right after all?  Nope, on the same machine I have

# scontrol -d show job 1960208 | grep CPU_ID
Nodes=r17 CPU_IDs=12-19 Mem=51200 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_5164679/job_1960208/cpuset.cpus
1,3,5,7,9,11,13,15

and in /proc/cpuinfo the apicid for processor 12 is 16

# scontrol -d show job 1967214 | grep CPU_ID
Nodes=r17 CPU_IDs=8-11,20-23 Mem=51200 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_5164679/job_1967214/cpuset.cpus
16-23

I am totally lost now. Seems totally random. SLURM devs?  Any insight?


-- Paul Raines (http://help.nmr.mgh.harvard.edu)



On Wed, 14 Dec 2022 1:33am, Marcus Wagner wrote:


 Hi Paul,

 sorry to say, but that has to be some coincidence on your system. I've
 never seen Slurm report core numbers higher than the total number of cores.

 I have e.g. an Intel Platinum 8160 here: 24 cores per socket, no
 HyperThreading activated.
 Yet here are the last lines of /proc/cpuinfo:

 processor   : 43
 apicid  : 114
 processor   : 44
 apicid  : 116
 processor   : 45
 apicid  : 118
 processor   : 46
 apicid  : 120
 processor   : 47
 apicid  : 122

 I have never seen Slurm report core numbers > 96 for a job.
 Nonetheless, I agree that the core numbers reported by Slurm mostly have
 nothing to do with the cores reported e.g. by cgroups.
 Since Slurm creates the cgroups, I wonder why it reports some kind of
 abstract core ID; it should know which cores are used, as it
 creates the cgroups for the jobs.

 Best
 Marcus

 On 13.12.2022 at 16:39, Paul Raines wrote:


  Yes, it looks like SLURM is using the apicid that is in /proc/cpuinfo.
  The first 14 CPUs (processors 0-13) have apicid
  0,2,4,6,8,10,12,14,16,20,22,24,26,28 in /proc/cpuinfo

  So after setting CpuSpecList=0,2,4,6,8,10,12,14,16,18,20,22,24,26
  in slurm.conf it appears to be doing what I want

  $ echo $SLURM_JOB_ID
  9
  $ grep -i ^cpu /proc/self/status
  Cpus_allowed:   000f,000f
  Cpus_allowed_list:  16-19,48-51
  $ scontrol -d show job 9 | grep CPU_ID
    Nodes=larkin CPU_IDs=32-39 Mem=25600 GRES=

  apicid=32 is processor=16 and apicid=33 is processor=48 in /proc/cpuinfo

  Thanks

  -- Paul Raines (http://help.nmr.mgh.harvard.edu)

[slurm-users] setting job table index start value

2022-12-14 Thread rstory+slurm
Hello,

We've been working on upgrading from Slurm 20 to the latest Slurm 22
and ran into the issue of extremely slow database schema migration. One
workaround I found a reference to suggests starting Slurm 22 with a
fresh database, converting the old database offline, and importing it
later. Obviously this requires starting the new database table indexes
higher than the existing ones.

I haven't delved into the schema to figure out how the tables are
related, and I was hoping someone else had figured this out and maybe
written a script to deal with it.

Any/all help or suggestions appreciated.
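
One possible sketch, assuming the slurmdbd MySQL schema uses a per-cluster job
table named <cluster>_job_table with an auto-increment column job_db_inx (verify
against your own schema, stop slurmdbd and back up the database first; the
cluster name and target value below are hypothetical):

$ # highest job index in the old database
$ mysql slurm_acct_db -e "SELECT MAX(job_db_inx) FROM mycluster_job_table;"
$ # in the fresh database, start new indexes well above that value
$ mysql slurm_acct_db -e "ALTER TABLE mycluster_job_table AUTO_INCREMENT = 10000000;"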
 
Thanks,
Robert



Re: [slurm-users] CPUSpecList confusion

2022-12-14 Thread Paul Raines


Yes, I see that on some of my other machines too.  So apicid is definitely 
not what SLURM is using but somehow just lines up that way on this one 
machine I have.


I think the issue is that cgroups counts, starting at 0, all the cores on the
first socket, then all the cores on the second socket.  But SLURM (on a
two socket box) counts 0 as the first core on the first socket, 1 as the
first core on the second socket, 2 as the second core on the first socket,
3 as the second core on the second socket, and so on. (Looks like I am
wrong: see below)

Why slurm does this instead of just using the assignments cgroups uses
I have no idea.  Hopefully one of the SLURM developers reads this
and can explain

Looking at another SLURM node I have (where cgroups v1 is still in use
and HT turned off) with definition

CPUs=24 Boards=1 SocketsPerBoard=2 CoresPerSocket=12 ThreadsPerCore=1

I find

[root@r440-17 ~]# egrep '^(apicid|proc)' /proc/cpuinfo  | tail -4
processor   : 22
apicid  : 22
processor   : 23
apicid  : 54

So apicid's are NOT going to work

# scontrol -d show job 1966817 | grep CPU_ID
 Nodes=r17 CPU_IDs=2 Mem=16384 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_3776056/job_1966817/cpuset.cpus
4

If Slurm's '2' meant the second core on the first socket, it should be '1' in
cgroups, but as we see above it is 4, which is the fifth core on the first
socket.  So I guess I was wrong above.


But in /proc/cpuinfo the apicid for processor 4 is 2!!!  So is apicid
right after all?  Nope, on the same machine I have

# scontrol -d show job 1960208 | grep CPU_ID
 Nodes=r17 CPU_IDs=12-19 Mem=51200 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_5164679/job_1960208/cpuset.cpus
1,3,5,7,9,11,13,15

and in /proc/cpuinfo the apicid for processor 12 is 16

# scontrol -d show job 1967214 | grep CPU_ID
 Nodes=r17 CPU_IDs=8-11,20-23 Mem=51200 GRES=
# cat /sys/fs/cgroup/cpuset/slurm/uid_5164679/job_1967214/cpuset.cpus
16-23

I am totally lost now. Seems totally random. SLURM devs?  Any insight?


-- Paul Raines (http://help.nmr.mgh.harvard.edu)



On Wed, 14 Dec 2022 1:33am, Marcus Wagner wrote:


Hi Paul,

sorry to say, but that has to be some coincidence on your system. I've never
seen Slurm report core numbers higher than the total number of cores.


I have e.g. an Intel Platinum 8160 here: 24 cores per socket, no
HyperThreading activated.

Yet here are the last lines of /proc/cpuinfo:

processor   : 43
apicid  : 114
processor   : 44
apicid  : 116
processor   : 45
apicid  : 118
processor   : 46
apicid  : 120
processor   : 47
apicid  : 122

I have never seen Slurm report core numbers > 96 for a job.
Nonetheless, I agree that the core numbers reported by Slurm mostly have nothing
to do with the cores reported e.g. by cgroups.
Since Slurm creates the cgroups, I wonder why it reports some kind of
abstract core ID; it should know which cores are used, as it
creates the cgroups for the jobs.


Best
Marcus

On 13.12.2022 at 16:39, Paul Raines wrote:


 Yes, it looks like SLURM is using the apicid that is in /proc/cpuinfo.
 The first 14 CPUs (processors 0-13) have apicid
 0,2,4,6,8,10,12,14,16,20,22,24,26,28 in /proc/cpuinfo

 So after setting CpuSpecList=0,2,4,6,8,10,12,14,16,18,20,22,24,26
 in slurm.conf it appears to be doing what I want

 $ echo $SLURM_JOB_ID
 9
 $ grep -i ^cpu /proc/self/status
 Cpus_allowed:   000f,000f
 Cpus_allowed_list:  16-19,48-51
 $ scontrol -d show job 9 | grep CPU_ID
   Nodes=larkin CPU_IDs=32-39 Mem=25600 GRES=

 apicid=32 is processor=16 and apicid=33 is processor=48 in /proc/cpuinfo

 Thanks

 -- Paul Raines (http://help.nmr.mgh.harvard.edu)



 On Tue, 13 Dec 2022 9:52am, Sean Maxwell wrote:


 In the slurm.conf manual they state the CpuSpecList IDs are "abstract", and in
 the CPU management docs they enforce the notion that the abstract Slurm IDs
 are not related to the Linux hardware IDs, so that is probably the source of
 the behavior. I unfortunately don't have more information.

 On Tue, Dec 13, 2022 at 9:45 AM Paul Raines 
 wrote:



 Hmm.  Actually looks like confusion between CPU IDs on the system
 and what SLURM thinks the IDs are

 # scontrol -d show job 8
 ...
   Nodes=foobar CPU_IDs=14-21 Mem=25600 GRES=
 ...

 # cat
 /sys/fs/cgroup/system.slice/slurmstepd.scope/job_8/cpuset.cpus.effective
 7-10,39-42


 -- Paul Raines (http://help.nmr.mgh.harvard.edu)



 On Tue, 13 Dec 2022 9:40am, Paul Raines wrote:

> 
>  Oh but that does explain the CfgTRES=cpu=14.  With the Cp

Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Will Furnass
If you pipe output into 'less -S' then you get horizontal scrolling.
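
For example (a sketch):

$ sacct -l | less -S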

Will

On Wed, 14 Dec 2022, 07:03 Chandler Sobel-Sorenson, <
chand...@genome.arizona.edu> wrote:

> Is there a recommended way to read output from `sacct` involving `-l` or
> `--long` option?  I have dual monitors and shrunk the terminal's font down
> to 6 pt or so until I could barely read it, giving me 675 columns.  This
> was still not enough...
>
> Perhaps there is a way of displaying it so the lines don't wrap and I can
> use left/right arrow keys to scroll the output, much like `systemctl` and
> `journalctl` can do?
>
> Perhaps there is a way to import it into a spreadsheet?
>
> This was with version 19.05 at least.  Apologies if the output has changed
> in newer versions...
>
> Thanks
>
>
>