[slurm-users] Disable exclusive flag for users

2022-03-24 Thread pankajd
Hi,

We have Slurm 21.08.6 and GPUs in our compute nodes. We want to restrict or
disable the use of the "exclusive" flag in srun for users. How should we do this?


--
Thanks and regards,

PVD






Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread Stephen Cousins
If you want to have the same number of processes per node, like:

#PBS -l nodes=4:ppn=8

then what I am doing (maybe there is another way?) is:

#SBATCH --ntasks-per-node=8
#SBATCH --nodes=4
#SBATCH --mincpus=8

This is because "--ntasks-per-node" is actually "maximum number of tasks
per node" and "--nodes=4" means "minimum number of nodes". I'm sure other
variations (specifying --ntasks=32, --mincpus=8 and --nodes=4-4 might do it
too) but this one is what I've been using. I remember being surprised when
coming over from Torque to find that "--ntasks-per-node" and --nodes did
not mean what they so obviously seemed to mean.
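
For reference, the variation mentioned above would look something like this
(an untested sketch):

#SBATCH --nodes=4-4
#SBATCH --ntasks=32
#SBATCH --mincpus=8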


Steve



Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
Thank you!  We recently converted from pbs, and I was converting “ppn=X” to
“-n X”.  Does it make more sense to convert “ppn=X” to --“cpus-per-task=X”?

Thanks again
David

-- 
Sent from Gmail Mobile


Re: [slurm-users] How to open a slurm support case

2022-03-24 Thread Fulcomer, Samuel
...it is a bit arcane, but it's not like we're funding lavish
lifestyles with our support payments. I would prefer to see a slightly more
differentiated support system, but this suffices...



Re: [slurm-users] How to open a slurm support case

2022-03-24 Thread Sean Crosby
Hi Jeff,

The support system is here - https://bugs.schedmd.com/

Create an account, log in, and when creating a request, select your site from 
the Site selection box.

Sean



Re: [slurm-users] How to open a slurm support case

2022-03-24 Thread Jason Booth
Jeff,

 I will reach out to you directly.

-Jason



-- 

Jason Booth
Director of Support, SchedMD LLC
Commercial Slurm Development and Support


[slurm-users] How to open a slurm support case

2022-03-24 Thread Jeffrey R. Lang
Can someone provide me with instructions on how to open a support case with 
SchedMD?

We have a support contract, but nowhere on their website can I find a link to
open a case with them.

Thanks,
Jeff


Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread Thomas M. Payerle
Although all three cases ( "-N 1 --cpus-per-task 64 -n 1", "-N 1
--cpus-per-task 1 -n 64", and "-N 1 --cpus-per-task 32 -n 2") will cause
Slurm to allocate 64 cores to the job, there can (and will) be differences
in the other respects.

The variable SLURM_NTASKS will be set to the value of the -n (aka --ntasks)
option, and other Slurm variables will differ as well.

More importantly, as others noted, srun will launch $SLURM_NTASKS
processes.  The mpirun/mpiexec/etc binaries of most MPI libraries will (if
compiled with support for Slurm) act similarly (and indeed, I believe most
use srun under the hood).

If you are just using sbatch and launching a single process using 64
threads, then the different options are probably equivalent for most intents
and purposes.  Similarly if you are running a loop to start 64 single-threaded
processes.  But those are simplistic cases that just happen to "work" even
though you are "abusing" the scheduler options.  And even the cases wherein
it "works" are subject to unexpected failures (e.g. if one substitutes srun
for sbatch).

The differences are most clear when the -N 1 flag is not given.  Generally,
SLURM_NTASKS should be the number of MPI or similar tasks you intend to
start.  By default, it is assumed the tasks can support distributed-memory
parallelism, so the scheduler assumes it can launch tasks on different nodes
(the -N 1 flag you mentioned would override that).  Each such task is assumed
to need --cpus-per-task cores, which the scheduler assumes require
shared-memory parallelism (i.e. they must be on the same node).
So without the -N 1, "--cpus-per-task 64 -n 1" will require 64 cores on a
single node, whereas "-n 64 --cpus-per-task 1" can result in the job being
assigned anything from 64 cores on a single node to a single core on each of
64 nodes, or any combination in between totalling 64 cores.  The
"--cpus-per-task 32 -n 2" case will assign either one node with 64 cores or
2 nodes with 32 cores each.
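
A quick way to see the resulting geometry is to print the relevant Slurm
variables from inside the job; a rough sketch (output lands in the usual
slurm-%j.out files):

sbatch --ntasks=1 --cpus-per-task=64 \
  --wrap 'echo "$SLURM_NTASKS task(s), $SLURM_JOB_NUM_NODES node(s): $SLURM_JOB_NODELIST"'
sbatch --ntasks=64 --cpus-per-task=1 \
  --wrap 'echo "$SLURM_NTASKS task(s), $SLURM_JOB_NUM_NODES node(s): $SLURM_JOB_NODELIST"'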

As I said, although there are some simple cases where the different options
are mostly functionally equivalent, I would recommend trying to use the
proper arguments --- "abusing" the arguments might work for a while but
will likely bite you in the end.  E.g., the 64-thread case should use
"--cpus-per-task 64", and the loop launching processes should _probably_
use "-n 64" (assuming it can handle the tasks being assigned to different
nodes).



-- 
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads   paye...@umd.edu
5825 University Research Park   (301) 405-6135
University of Maryland
College Park, MD 20740-3831


[slurm-users] Help with failing job execution

2022-03-24 Thread Jeffrey R. Lang
My site recently updated to Slurm 21.08.6 and for the most part everything went
fine.  Two Ubuntu nodes, however, are having issues: slurmd cannot execve the
jobs on those nodes.  As an example:

[jrlang@tmgt1 ~]$ salloc -A ARCC --nodes=1 --ntasks=20 -t 1:00:00 --bell 
--nodelist=mdgx01 --partition=dgx /bin/bash
salloc: Granted job allocation 2328489
[jrlang@tmgt1 ~]$ srun hostname
srun: error: task 0 launch failed: Slurmd could not execve job
srun: error: task 1 launch failed: Slurmd could not execve job
srun: error: task 2 launch failed: Slurmd could not execve job
srun: error: task 3 launch failed: Slurmd could not execve job
srun: error: task 4 launch failed: Slurmd could not execve job
srun: error: task 5 launch failed: Slurmd could not execve job
srun: error: task 6 launch failed: Slurmd could not execve job
srun: error: task 7 launch failed: Slurmd could not execve job
srun: error: task 8 launch failed: Slurmd could not execve job
srun: error: task 9 launch failed: Slurmd could not execve job
srun: error: task 10 launch failed: Slurmd could not execve job
srun: error: task 11 launch failed: Slurmd could not execve job
srun: error: task 12 launch failed: Slurmd could not execve job
srun: error: task 13 launch failed: Slurmd could not execve job
srun: error: task 14 launch failed: Slurmd could not execve job
srun: error: task 15 launch failed: Slurmd could not execve job
srun: error: task 16 launch failed: Slurmd could not execve job
srun: error: task 17 launch failed: Slurmd could not execve job
srun: error: task 18 launch failed: Slurmd could not execve job
srun: error: task 19 launch failed: Slurmd could not execve job

Looking in slurmd-mdgx01.log we only see

[2022-03-24T14:44:02.408] [2328501.interactive] error: Failed to invoke task 
plugins: one of task_p_pre_setuid functions returned error
[2022-03-24T14:44:02.409] [2328501.interactive] error: job_manager: exiting 
abnormally: Slurmd could not execve job
[2022-03-24T14:44:02.411] [2328501.interactive] done with job


Note that this issue didn't occur with Slurm 20.11.8.

Any ideas what could be causing it? I'm stumped.

Jeff


Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
“Will launch 64 instances of your application, each bound to a single cpu”

This is true for srun, but not for sbatch.

A while back, we did an experiment using “hostname” to verify.
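
A rough sketch of that kind of check: the batch script body runs once, while
an srun step inside it launches one copy per task, e.g.

sbatch -N 1 -n 64 --wrap 'hostname; srun hostname'

The job output should then contain one hostname line from the script itself,
followed by 64 lines from the srun step.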

--
Sent from Gmail Mobile


Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread Ralph Castain
Well, there is indeed a difference - and it is significant.

> On Mar 24, 2022, at 12:32 PM, David Henkemeyer  
> wrote:
> 
> Assuming -N is 1 (meaning, this job needs only one node), then is there a 
> difference between any of these 3 flag combinations:
> 
> -n 64 (leaving cpus-per-task to be the default of 1)

Will launch 64 instances of your application, each bound to a single cpu

> --cpus-per-task  64 (leaving -n to be the default of 1)

Will run ONE instance of your application (no binding if the node has 64 CPUs -
otherwise, the process will be bound to 64 CPUs)

> --cpus-per-task 32 -n 2

Will run TWO instances of your application, each bound to 32 CPUs

> 
> As far as I can tell, there is no functional difference. But if there is even 
> a subtle difference, I would love to know what it is!
> 
> Thanks
> David 
> -- 
> Sent from Gmail Mobile
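
One way to check the bindings described above is to print each task's CPU
affinity (a sketch; taskset comes from util-linux, and what you see depends
on the configured TaskPlugin):

srun -N 1 -n 2 --cpus-per-task=32 bash -c \
  'echo "task $SLURM_PROCID: $(taskset -cp $$)"'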




[slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
Assuming -N is 1 (meaning, this job needs only one node), then is there a
difference between any of these 3 flag combinations:

-n 64 (leaving cpus-per-task to be the default of 1)
--cpus-per-task  64 (leaving -n to be the default of 1)
--cpus-per-task 32 -n 2

As far as I can tell, there is no functional difference. But if there is
even a subtle difference, I would love to know what it is!

Thanks
David
-- 
Sent from Gmail Mobile


Re: [slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Ole Holm Nielsen
Here is an example command for getting parseable output from sacct of all 
completed jobs during a specific period of time:


$ sacct -p -X -a -S 032322 -E 032422 -o JobID,User,State -s ca,cd,f,to,pr,oom

The fields are separated by | and can easily be parsed by awk.

Example output:

JobID|User|State|
4753873_126|catscr|TIMEOUT|
4753873_129|catscr|TIMEOUT|
4753873_136|catscr|FAILED|
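
The same parseable output can be piped straight into awk; for example, a
sketch that counts jobs per state:

sacct -p -X -a -S 032322 -E 032422 -o JobID,User,State -s ca,cd,f,to,pr,oom \
  | awk -F'|' 'NR > 1 && $3 != "" { n[$3]++ } END { for (s in n) print s, n[s] }'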

I hope this helps.

/Ole





Re: [slurm-users] srun and --cpus-per-task

2022-03-24 Thread Hermann Schwärzler

Hi Durai,

I see the same thing as you on our test-cluster that has
ThreadsPerCore=2
configured in slurm.conf

The double-foo goes away with this:
srun --cpus-per-task=1 --hint=nomultithread echo foo

Having multithreading enabled leads, IMHO, to surprising behaviour in
Slurm. My impression is that using it makes the concept of "a CPU" in
Slurm somewhat fuzzy. It becomes unclear and ambiguous what you get when
using the CPU-related options of srun, sbatch and salloc: is it a
CPU core or a CPU thread?


I think what you found is a bug.

If you run

for c in {4..1}
do
 echo "## $c ###"
 srun -c $c bash -c 'echo $SLURM_CPU_BIND_LIST'
done

you will get:

## 4 ###
0x003003
## 3 ###
0x003003
## 2 ###
0x001001
## 1 ###
0x01,0x001000
0x01,0x001000

You see: requesting 4 and 3 CPUs results in the same CPU binding, as both
need two CPU cores with 2 threads each. In the "3" case one of them stays
unused but of course is not free for another job.
In the "1" case I would expect to see the same binding as in the "2"
case. If you combine the two values in the list you *do* get the same
value, but obviously it is a list of two values, and this might be the
origin of the problem.
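
A quick way to see the effect (a sketch, assuming a node with
ThreadsPerCore=2) is to compare the reported task count with and without the
hint; on our test cluster the first command prints two lines, the second
only one:

srun -c 1 bash -c 'echo "task $SLURM_PROCID of $SLURM_NTASKS"'
srun -c 1 --hint=nomultithread bash -c 'echo "task $SLURM_PROCID of $SLURM_NTASKS"'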


It is probably related to what's mentioned in the documentation for 
'--ntasks':
"[...] The default is one task per node, but note that the 
--cpus-per-task option will change this default."


Regards
Hermann





Re: [slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Brian Andrus

I don't think that is part of sacct options. Feature request maybe.

Meanwhile, awk would be your friend here. Just post-process by piping 
the output to awk and doing the substitutions before printing the output.


eg:

    sacct  |awk '{sub("CANCELLED","CA");sub("RUNNING","RU");print}'

Just add a 'sub' command for each substitution. It is tedious to set up 
but will do the trick. You can also specify the specific field to do any 
substitution on.
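
For example, to touch only the State column (a sketch, assuming the default
sacct format where State is the 6th whitespace-separated column):

    sacct | awk '{ sub("CANCELLED","CA",$6); sub("RUNNING","R",$6); print }'

Note that rewriting a field makes awk reflow the whitespace, so the column
alignment of the output will change.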


Brian Andrus





Re: [slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Ole Holm Nielsen

Hi Chip,

Use the sacct -p or --parsable option to get the complete output delimited 
by |


/Ole






[slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Chip Seraphine
I’m trying to shave a few columns off of some sacct output, and while it will 
happily accept the short codes (e.g. CA instead of CANCELLED) I can’t find a 
way to get it to report them.  Shaving down the columns using %N in --format 
just results in a truncated version of the long code, which is often not the 
same thing.

Does anyone know if/how this can be done?

--

Chip Seraphine
Linux Admin (Grid)

E: cseraph...@drwholdings.com
M: 773 412 2608



[slurm-users] srun and --cpus-per-task

2022-03-24 Thread Durai Arasan
Hello Slurm users,

We are experiencing strange behavior with srun executing commands twice
only when setting --cpus-per-task=1

$ srun --cpus-per-task=1 --partition=gpu-2080ti echo foo
srun: job 1298286 queued and waiting for resources
srun: job 1298286 has been allocated resources
foo
foo

This is not seen when --cpus-per-task is another value:

$ srun --cpus-per-task=3 --partition=gpu-2080ti echo foo
srun: job 1298287 queued and waiting for resources
srun: job 1298287 has been allocated resources
foo

Also when specifying --ntasks:
$ srun -n1 --cpus-per-task=1 --partition=gpu-2080ti echo foo
srun: job 1298288 queued and waiting for resources
srun: job 1298288 has been allocated resources
foo

Relevant slurm.conf settings are:
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
# example node configuration
NodeName=slurm-bm-58 NodeAddr=xxx.xxx.xxx.xxx Procs=72 Sockets=2
CoresPerSocket=18 ThreadsPerCore=2 RealMemory=354566 Gres=gpu:rtx2080ti:8
Feature=xx_v2.38 State=UNKNOWN

On closer inspection of the job variables in the "--cpus-per-task=1" case, the
following variables have wrongly acquired a value of 2 for no apparent reason:
SLURM_NTASKS=2
SLURM_NPROCS=2
SLURM_TASKS_PER_NODE=2
SLURM_STEP_NUM_TASKS=2
SLURM_STEP_TASKS_PER_NODE=2

Can you see what could be wrong?

Best,
Durai


Re: [slurm-users] how to locate the problem when slurm failed to restrict gpu usage of user jobs

2022-03-24 Thread Sean Maxwell
cgroups can control access to devices (e.g. /dev/nvidia0), which is how I
understand it to work.
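
For reference, the relevant piece of /etc/slurm/cgroup.conf is just this (a
minimal sketch, assuming TaskPlugin includes task/cgroup; other settings
omitted):

ConstrainDevices=yes

With that set, the task/cgroup plugin places each job step in a devices
cgroup that only allows the GRES devices the job was allocated, so the other
/dev/nvidia* files are simply inaccessible, independently of
CUDA_VISIBLE_DEVICES.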

-Sean



[slurm-users] Re: how to locate the problem when slurm failed to restrict gpu usage of user jobs

2022-03-24 Thread taleintervenor
Well, this is indeed the point. We didn’t set ConstrainDevices=yes in 
cgroup.conf. After adding it, the GPU restriction works as expected.

But what is the relation between GPU restriction and cgroups? I had never heard 
that cgroups can limit GPU card usage. Isn’t that a feature of CUDA or the 
NVIDIA driver?

 

From: Sean Maxwell 
Sent: 23 March 2022 23:05
To: Slurm User Community List 
Subject: Re: [slurm-users] how to locate the problem when slurm failed to restrict 
gpu usage of user jobs

 

Hi,

 

If you are using cgroups for task/process management, you should verify that 
your /etc/slurm/cgroup.conf has the following line:

 

ConstrainDevices=yes

 

I'm not sure about the missing environment variable, but the absence of the 
above in cgroup.conf is one way the GPU devices can be unconstrained in the 
jobs.

 

-Sean

 

 

 

On Wed, Mar 23, 2022 at 10:46 AM  wrote:

Hi, all:

 

We found a problem where a Slurm job submitted with an argument such as --gres gpu:1 
is not restricted in its GPU usage; the user can still see all GPU cards on the 
allocated nodes.

Our gpu node has 4 cards with their gres.conf to be:

> cat /etc/slurm/gres.conf

Name=gpu Type=NVlink_A100_40GB File=/dev/nvidia0 CPUs=0-15

Name=gpu Type=NVlink_A100_40GB File=/dev/nvidia1 CPUs=16-31

Name=gpu Type=NVlink_A100_40GB File=/dev/nvidia2 CPUs=32-47

Name=gpu Type=NVlink_A100_40GB File=/dev/nvidia3 CPUs=48-63

 

And for testing, we submit a simple batch job like:

#!/bin/bash

#SBATCH --job-name=test

#SBATCH --partition=a100

#SBATCH --nodes=1

#SBATCH --ntasks=6

#SBATCH --gres=gpu:1

#SBATCH --reservation="gpu test"

hostname

nvidia-smi

echo end

 

Then in the output file, nvidia-smi showed all 4 GPU cards, but we expected to 
see only the 1 allocated GPU card.

 

The official Slurm documentation says it will set the CUDA_VISIBLE_DEVICES env 
var to restrict the GPU cards available to the user. But we did not find such a 
variable in the job environment. We only confirmed that it exists in the prolog 
script environment, by adding the debug command “echo $CUDA_VISIBLE_DEVICES” to 
the Slurm prolog script.

 

So how does Slurm cooperate with the NVIDIA tools to make a job's user see only 
the allocated GPU cards? What are the requirements on the NVIDIA GPU driver, CUDA 
toolkit, or any other component for Slurm to correctly restrict GPU usage?