[slurm-users] Re: Can SLURM queue different jobs to start concurrently?

2024-07-08 Thread Davide DelVento via slurm-users
I think the best way to do it would be to schedule the 10 things as a
single Slurm job and then use one of the various MPMD approaches (the
nitty-gritty details depend on whether each executable is serial, OpenMP,
MPI or hybrid).
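For illustration, a minimal sketch of one such MPMD approach using srun's
--multi-prog option, assuming ten executables named app0..app9 sitting in the
submit directory (those names are placeholders; per-application CPU/memory
shaping would need more work, e.g. Slurm heterogeneous jobs):

#!/bin/bash
#SBATCH --job-name=simulation
#SBATCH --ntasks=10

# multi.conf maps each task rank to a different executable; all ten
# tasks are launched together and can talk to each other over TCP/IP.
cat > multi.conf <<'EOF'
0    ./app0
1    ./app1
2-9  ./app%t
EOF

srun --multi-prog multi.conf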

On Mon, Jul 8, 2024 at 2:20 PM Dan Healy via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hi there,
>
> I've received a question from an end user, which I presume the answer is
> "No", but would like to ask the community first.
>
> Scenario: The user wants to create a series of jobs that all need to start
> at the same time. Example: there are 10 different executable applications
> which have varying CPU and RAM constraints, all of which need to
> communicate via TCP/IP. Of course the user could design some type of
> idle/statusing mechanism to wait until all jobs are *randomly *started,
> then begin execution, but this feels like a waste of resources. The
> complete execution of these 10 applications would be considered a single
> simulation. The goal would be to distribute these 10 applications across
> the cluster and not necessarily require them all to execute on a single
> node.
>
> Is there a good architecture for this using SLURM? If so, please kindly
> point me in the right direction.
>
> --
> Thanks,
>
> Daniel Healy
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Best practice for jobs resuming from suspended state

2024-05-16 Thread Davide DelVento via slurm-users
I don't really have an answer for you, just responding to make your message
pop out in the "flood" of other topics we've got since you posted.

On our cluster we configure preemption to cancel jobs because that makes more
sense for our situation, so I have no experience with jobs resuming from
being suspended. I can think of two possible reasons for this:

- one is memory (have you checked your memory logs to see if there is a
correlation between node memory occupation and jobs not resuming correctly?)
- the second one is some resources disappearing (temp files? maybe in some
circumstances slurm totally wipes out /tmp when the second job runs -- if so,
that would be a slurm bug, obviously)

Assuming that you're stuck without finding a root cause which you can
address, I guess it depends on what "doesn't recover" means. It's one thing
if it crashes immediately. It's another if it just stalls without even
starting but slurm still thinks it's running and the users are charged
their allocation -- even worse if your cluster does not enforce a
wallclock limit (or has a very long one). Depending on frequency of the
issue, size of your cluster and other conditions, you may want to consider
writing a watchdog script which would search for these jobs and cancel them?
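For what it's worth, a rough sketch of that watchdog idea (not something we
actually run); how "stalled" is detected, the snapshot interval, and the
paths are all assumptions you would have to tune:

#!/bin/bash
# Hypothetical watchdog (run from cron): compare each running job's TotalCPU
# between two snapshots; if it did not grow at all, assume the job never
# recovered from suspension and cancel it.
STATE_DIR=/var/tmp/job_watchdog
mkdir -p "$STATE_DIR"
for jobid in $(squeue -h -t RUNNING -o "%i"); do
    cpu_now=$(sstat -a -n -P -j "$jobid" --format=TotalCPU 2>/dev/null | head -1)
    prev_file="$STATE_DIR/$jobid"
    if [ -f "$prev_file" ] && [ -n "$cpu_now" ] && \
       [ "$cpu_now" = "$(cat "$prev_file")" ]; then
        echo "Job $jobid looks stalled (TotalCPU unchanged), cancelling"
        scancel "$jobid"
    else
        echo "$cpu_now" > "$prev_file"
    fi
done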

As I said, not really an answer, just my $0.02 (or even less)

On Wed, May 15, 2024 at 1:54 AM Paul Jones via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hi,
>
> We use PreemptMode and PriorityTier within Slurm to suspend low priority
> jobs when more urgent work needs to be done. This generally works well, but
> on occasion resumed jobs fail to restart - which is to say Slurm sets the
> job status to running but the actual code doesn't recover from being
> suspended.
>
> Technically everything is working as expected, but I wondered if there was
> any best practice to pass onto users about how to cope with this state?
> Obviously not a direct Slurm question, but wondered if others had
> experience with this and any advice on how best to limit the impact?
>
> Thanks,
> Paul
>
> --
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: memory high water mark reporting

2024-05-16 Thread Davide DelVento via slurm-users
Not exactly the answer to your question (which I don't know), but if you can
prefix whatever is executed with https://github.com/NCAR/peak_memusage
(which also uses getrusage), or a variant of it, you will be able to do that.
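For what it's worth, this is the kind of prefixing I mean inside a batch
script; the exact wrapper name and options come from that repository's README
and may differ by version, so treat them as assumptions:

#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --mem=16G

# Wrapping each task reports its memory high-water mark (via getrusage)
# when the task ends, independently of the acct_gather polling interval.
srun peak_memusage ./my_solver input.dat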

On Thu, May 16, 2024 at 4:10 PM Emyr James via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hi,
>
> We are trying out slurm having been running grid engine for a long while.
> In grid engine, the cgroups peak memory and max_rss are generated at the
> end of a job and recorded. It logs the information from the cgroup
> hierarchy as well as doing a getrusage call right at the end on the parent
> pid of the whole job "container" before cleaning up.
> With slurm it seems that the only way memory is recorded is by the acct
> gather polling. I am trying to add something in an epilog script to get the
> memory.peak but It looks like the cgroup hierarchy has been destroyed by
> the time the epilog is run.
> Where in the code is the cgroup hierarchy cleared up ? Is there no way to
> add something in so that the accounting is updated during the job cleanup
> process so that peak memory usage can be accurately logged ?
>
> I can reduce the polling interval from 30s to 5s but don't know if this
> causes a lot of overhead and in any case this seems to not be a sensible
> way to get values that should just be determined right at the end by an
> event rather than using polling.
>
> Many thanks,
>
> Emyr
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: StateSaveLocation and Slurm HA

2024-05-07 Thread Davide DelVento via slurm-users
Are you seeking something simple rather than sophisticated? If so, you can
use the controller local disk for StateSaveLocation and place a cron job
(on the same node or somewhere else) to take that data out via e.g. rsync
and put it where you need it (NFS?) for the backup control node to use
if/when needed. That obviously introduces a time delay, which might or might
not be problematic depending on what kind of failures you are trying to
protect against and what level of guarantee you want the HA to provide:
you will not be protected in every possible scenario. On the other hand,
given the size of the cluster that might be adequate, and it's basically
zero effort, so it might be "good enough" for you.
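To make that concrete, a minimal sketch (paths, host name and interval are
placeholders, and the caveat above about the copy lagging behind still
applies):

# crontab on the primary controller: every 2 minutes, push the
# StateSaveLocation to somewhere the backup controller can reach
*/2 * * * * rsync -a --delete /var/spool/slurmctld/ backupctl:/var/spool/slurmctld/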

On Tue, May 7, 2024 at 4:44 AM Pierre Abele via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hi all,
>
> I am looking for a clean way to set up Slurms native high availability
> feature. I am managing a Slurm cluster with one control node (hosting
> both slurmctld and slurmdbd), one login node and a few dozen compute
> nodes. I have a virtual machine that I want to set up as a backup
> control node.
>
> The Slurm documentation says the following about the StateSaveLocation
> directory:
>
> > The directory used should be on a low-latency local disk to prevent file
> system delays from affecting Slurm performance. If using a backup host, the
> StateSaveLocation should reside on a file system shared by the two hosts.
> We do not recommend using NFS to make the directory accessible to both
> hosts, but do recommend a shared mount that is accessible to the two
> controllers and allows low-latency reads and writes to the disk. If a
> controller comes up without access to the state information, queued and
> running jobs will be cancelled. [1]
>
> My question: How do I implement the shared file system for the
> StateSaveLocation?
>
> I do not want to introduce a single point of failure by having a single
> node that hosts the StateSaveLocation, neither do I want to put that
> directory on the clusters NFS storage since outages/downtime of the
> storage system will happen at some point and I do not want that to cause
> an outage of the Slurm controller.
>
> Any help or ideas would be appreciated.
>
> Best,
> Pierre
>
>
> [1] https://slurm.schedmd.com/quickstart_admin.html#Config
>
> --
> Pierre Abele, M.Sc.
>
> HPC Administrator
> Max-Planck-Institute for Evolutionary Anthropology
> Department of Primate Behavior and Evolution
>
> Deutscher Platz 6
> 04103 Leipzig
>
> Room: U2.80
> E-Mail: pierre_ab...@eva.mpg.de
> Phone: +49 (0) 341 3550 245
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Partition Preemption Configuration Question

2024-05-02 Thread Davide DelVento via slurm-users
Hi Jason,

I wanted exactly the same and was confused exactly like you. For a while it
did not work, regardless of what I tried, but eventually (with some help) I
figured it out.

What I set up, and it is working fine, is this globally:

PreemptType = preempt/partition_prio
PreemptMode=REQUEUE

and then individually each partition definition has either PreemptMode=off
or PreemptMode=cancel

It took me a while to make it work, and the problem in my case was that I
did not include the REQUEUE line, because (as I am describing) I did not
want requeue -- but without that line slurm preemption simply would not work.
Since it's overridden in each partition, it works as if it weren't there,
but it must be there. Very simple once you know it.
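Putting it together, a sketch of the relevant slurm.conf lines (partition
names, node lists and priorities are placeholders):

# global defaults: the default PreemptMode must not be OFF, even though
# every partition overrides it below
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE

# high-priority partitions are never preempted; jobs in the low-priority
# one are cancelled when a higher PriorityTier partition needs the nodes
PartitionName=regular Nodes=node[01-12] PriorityTier=200 PreemptMode=off State=UP
PartitionName=lowprio Nodes=node[01-36] PriorityTier=100 PreemptMode=cancel State=UP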

Hope this helps

On Thu, May 2, 2024 at 9:16 AM Jason Simms via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hello all,
>
> The Slurm docs have me a bit confused... I'm wanting to enable job
> preemption on certain partitions but not others. I *presume* I would
> set PreemptType=preempt/partition_prio globally, but then on the partitions
> where I don't want jobs to be able to be preempted, I would set
> PreemptMode=off within the configuration for that specific partition.
>
> The documentation, however, says that setting PreemptMode=off at a
> partition level "is only compatible with PreemptType=preempt/none at a
> global level" yet then immediately says that doing so is a "common use case
> for this parameter is to set it on a partition to disable preemption for
> that partition," which indicates preemption would still be allowable for
> other partitions.
>
> If PreemptType is set to preempt/none globally, and I *cannot* set that as
> an option for a given partition (at least, the documentation doesn't
> indicate that is a valid parameter for a partition), wouldn't preemption be
> disabled globally anyway? The wording seems odd to me and almost
> contradictory.
>
> Is it possible to have PreemptType=preempt/partition_prio set globally,
> yet also disable it on specific partitions with PreemptMode=off? Is
> PreemptType actually a valid configuration option for specific partitions?
>
> Thanks for any guidance.
>
> Warmest regards,
> Jason
>
> --
> *Jason L. Simms, Ph.D., M.P.H.*
> Manager of Research Computing
> Swarthmore College
> Information Technology Services
> (610) 328-8102
> Schedule a meeting: https://calendly.com/jlsimms
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Recover Batch Script Error

2024-02-16 Thread Davide DelVento via slurm-users
Yes, that is what we are also doing and it works well.
Note that when requesting the batch script of another user's job, one sees
nothing (rather than an error message saying that one does not have
permission).
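For context, the storage Paul mentions below has to be enabled before the
jobs run; a sketch of the two pieces involved:

# slurm.conf: keep each job's submit script in the accounting database
AccountingStoreFlags=job_script

# afterwards the script can be pulled back (for one's own jobs) with
sacct -B -j <jobid>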

On Fri, Feb 16, 2024 at 12:48 PM Paul Edmon via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Are you using the job_script storage option? If so then you should be able
> to get at it by doing:
>
> sacct -B -j JOBID
>
> https://slurm.schedmd.com/sacct.html#OPT_batch-script
>
> -Paul Edmon-
> On 2/16/2024 2:41 PM, Jason Simms via slurm-users wrote:
>
> Hello all,
>
> I've used the "scontrol write batch_script" command to output the job
> submission script from completed jobs in the past, but for some reason, no
> matter which job I specify, it tells me it is invalid. Any way to
> troubleshoot this? Alternatively, is there another way - even if a manual
> database query - to recover the job script, assuming it exists in the
> database?
>
> sacct --jobs=38960
> JobID         JobName     Partition   Account     AllocCPUS      State ExitCode
> ------------ ----------- ----------- ----------- ---------- ---------- --------
> 38960        amr_run_v+  tsmith2lab  tsmith2lab          72  COMPLETED      0:0
> 38960.batch  batch                   tsmith2lab          40  COMPLETED      0:0
> 38960.extern extern                  tsmith2lab          72  COMPLETED      0:0
> 38960.0      hydra_pmi+              tsmith2lab          72  COMPLETED      0:0
>
> scontrol write batch_script 38960
> job script retrieval failed: Invalid job id specified
>
> Warmest regards,
> Jason
>
> --
> *Jason L. Simms, Ph.D., M.P.H.*
> Manager of Research Computing
> Swarthmore College
> Information Technology Services
> (610) 328-8102
> Schedule a meeting: https://calendly.com/jlsimms
>
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Need help managing licence

2024-02-16 Thread Davide DelVento via slurm-users
The simple answer is to just add a line such as
Licenses=whatever:20

and then request your users to use the -L option as described at

https://slurm.schedmd.com/licenses.html
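For instance (license name and count are whatever matches your situation):

# slurm.conf
Licenses=cplex:20

# what users put on their submissions
sbatch -L cplex:1 job.sh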

This works very well; however, it does not do enforcement the way Slurm does
with other resources. You will find posts on this list from me trying to
achieve such enforcement with a prolog, but I ended up banging my head on the
keyboard too much and eventually gave up. User education was easier for me.
Depending on your user community, banging your head on the keyboard might be
easier than educating your users -- if so, please share how you solved the
issue.

On Fri, Feb 16, 2024 at 7:48 AM Sylvain MARET via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hello everyone !
>
> Recently our users bought a cplex dynamic license and want to use it on
> our slurm cluster.
> I've installed the paid version of cplex within modules so authorized
> user can load it with a simple module load cplex/2111 command but I
> don't know how to manage and ensure slurm doesn't launch a job if 20
> people are already running code with this license.
>
> How do you guys manage paid licenses on your cluster ? Any advice would
> be appreciated !
>
> Regards,
> Sylvain Maret
>
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Compilation question

2024-02-09 Thread Davide DelVento via slurm-users
Hi Sylvain,
For the series "better late than never": is this still a problem?
If so, is this a new install or an update?
What environment/compiler are you using? The error

undefined reference to `__nv_init_env'

seems to indicate that you are doing something cuda-related which I think
you should not be doing?

In any case, most people run on a RHEL (or compatible) distro and use
rpmbuild rather than straight configure/make, e.g. a variant of what is
described at https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/
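For reference, the basic flow on a RHEL-like system is something along these
lines (add whatever --with/--define options your site needs; the page above
lists the common ones):

# build binary RPMs straight from the release tarball, then install them
rpmbuild -ta slurm-22.05.11.tar.bz2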

Hope this helps,


On Wed, Jan 17, 2024 at 8:36 AM Sylvain MARET 
wrote:

> Hello everyone !
>
> I'm trying to compile slurm 22.05.11 on Rocky linux 8.7 with freeipmi
> support
>
> I've seen the documentation so I've done the configure step :
>
> ./configure --with-pmix=$PMIXHOME --with-ucx=$UCXHOME
> --with-nvml=$NVMLHOME --prefix=$SLURMHOME --with-freeipmi=/usr
>
> but when I run make I end up with the following error :
>
> /bin/sh ../../../../../libtool  --tag=CC   --mode=link gcc
> -DNUMA_VERSION1_COMPATIBILITY -g -O2 -fno-omit-frame-pointer -pthread
> -ggdb3 -Wall -g -O1 -fno-strict-aliasing -export-dynamic -L/usr/lib64
> -lhdf5_hl -lhdf5  -lsz -lz -ldl -lm  -o sh5util sh5util.o
> -Wl,-rpath=/softs/batch/slurm/22.05.11/lib/slurm
> -L../../../../../src/api/.libs -lslurmfull -ldl ../libhdf5_api.la
> -lpthread -lm -lresolv
> libtool: link: gcc -DNUMA_VERSION1_COMPATIBILITY -g -O2
> -fno-omit-frame-pointer -pthread -ggdb3 -Wall -g -O1
> -fno-strict-aliasing -o .libs/sh5util sh5util.o
> -Wl,-rpath=/softs/batch/slurm/22.05.11/lib/slurm -Wl,--export-dynamic
> -L/usr/lib64 -L../../../../../src/api/.libs
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so
> ../.libs/libhdf5_api.a -lhdf5_hl -lhdf5 -lsz -lz -ldl -lpthread -lm
> -lresolv -pthread -Wl,-rpath -Wl,/softs/batch/slurm/22.05.11/lib/slurm
> sh5util.o:(.init_array+0x0): undefined reference to `__nv_init_env'
> sh5util.o:(.init_array+0x8): undefined reference to `__flushz'
> sh5util.o:(.init_array+0x10): undefined reference to `__daz'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_list_transfer_unique'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_sort_key_pairs'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_xstrchr'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_unsetenvp'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_list_sort'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_list_for_each'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `__builtin__pgi_isnanld'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_get_extra_conf_path'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `__blt_pgi_ctzll'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_running_in_slurmctld'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `__c_mcopy1'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `__blt_pgi_clzll'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_list_create'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_list_count'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `__builtin_va_gparg1'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_destroy_config_key_pair'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_xfree_ptr'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_getenvp'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_free_buf'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_get_log_level'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `__c_mset8'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_xstrdup_printf'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_list_delete_first'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_list_append'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_error'
> /softs/batch/slurm/slurm-22.05.11/src/api/.libs/libslurmfull.so:
> undefined reference to `slurm_init_buf'
> 

[slurm-users] Re: Memory used per node

2024-02-09 Thread Davide DelVento via slurm-users
If you would like the high watermark memory utilization after the job
completes, https://github.com/NCAR/peak_memusage is a great tool. Of course
it has the limitation that you need to know that you want that information
*before* starting the job, which might or might not be a problem for your
use case.

On Fri, Feb 9, 2024 at 10:07 AM Gerhard Strangar via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hello,
>
> I'm wondering if there's a way to tell how much memory my job is using
> per node. I'm doing
>
> #SBATCH -n 256
> srun solver inputfile
>
> When I run sacct -o maxvmsize, the result apparently is the maxmimum VSZ
> of the largest solver process, not the maximum of the sum of them all
> (unlike when calling mpirun instead). When I sstat -o TresUsageInMax, I
> get the memory summed up over all nodes being used. Can I get the
> maximum VSZ per node?
>
>
> Gerhard
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


Re: [slurm-users] propose environment variables SLURM_STDOUT, SLURM_STDERR, SLURM_STDIN

2024-01-22 Thread Davide DelVento
I think it would be useful, yes, and mostly for the epilog script.

In the job script itself, you are creating such files, so some of the
proposed use cases are a bit tricky to get right in the way you described
them. For example, if you scp these files, you are copying them in whatever
state they are in at the moment scp runs. Something else might be appended
to the files after the command has run (e.g. scp warnings), and that would
not be included. Also, the buffers might not have flushed, so the scp'ed
version can be incomplete. Even worse for post-processing, which can be
covered better with something like the following in the slurm script

program_that_creates_lots_of_output | tee full_output_just_in_case_is_needed.txt | post_processing_script

So that the original slurm file will automatically contain the
post-processing version, and the "just in case" file will contain the full
log. Of course the name of the latter does not need to be hardcoded and can
use things like $SLURM_JOB_ID to make it unique for each job.
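For instance, a per-job variant of the pipeline above could look like:

program_that_creates_lots_of_output | tee full_output_${SLURM_JOB_ID}.txt | post_processing_script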

On Mon, Jan 22, 2024 at 12:11 AM Bjørn-Helge Mevik 
wrote:

> I would find that useful, yes.  Especially if the variables were made
> available for the Prolog and Epilog scripts.
>
> --
> Regards,
> Bjørn-Helge Mevik, dr. scient,
> Department for Research Computing, University of Oslo
>
>


Re: [slurm-users] preemptable queue

2024-01-12 Thread Davide DelVento
Thanks Paul for taking the time to further look into this. In fact you are
correct and adding a default mode (which is then overridden by each
partition setting) keeps slurm happy with that configuration. Moreover
(after restarting daemons, etc per the documentation) everything seems to
be working as I intended. I obviously need to do a few more tests,
especially for edge cases, but adding that default seems to have completely
fixed the problem.

Thanks again and have a great weekend!


On Fri, Jan 12, 2024 at 8:49 AM Paul Edmon  wrote:

> My concern was your config inadvertently having that line commented out and
> then seeing problems. If it wasn't, then no worries at this point.
>
> We run using preempt/partition_prio on our cluster and have a mix of
> partitions using PreemptMode=OFF and PreemptMode=REQUEUE. So I know that
> combination works. I would be surprised if PreemptMode=CANCEL did not work
> as that's a valid option.
>
> Something we do have set though is what the default mode is. We have set:
>
> ### Govern's default preemption behavior
> PreemptType=preempt/partition_prio
> PreemptMode=REQUEUE
>
> So you might try setting that default of PreemptMode=CANCEL and then set
> specific PreemptModes for all your partitions. That's what we do and it
> works for us.
>
> -Paul Edmon-
> On 1/12/2024 10:33 AM, Davide DelVento wrote:
>
> Thanks Paul,
>
> I don't understand what you mean by having a typo somewhere. I mean, that
> configuration works just fine right now, whereas if I add the commented out
> line any slurm command will just abort with the error "PreemptType and
> PreemptMode values incompatible". So, assuming there is a typo, it should
> be in the commented line right? Or are you saying that having that line
> makes slurm sensitive to a typo somewhere else that would be otherwise
> ignored? Obviously I can't exclude that option, but it seems unlikely to
> me. Also because it does say these two things are incompatible.
>
> It would obviously much better if the error would say what EXACTLY is
> incompatible with what, but the documentation at
> https://slurm.schedmd.com/preempt.html I see many clues of what that
> could be, and hence I am asking people here who may have deployed
> preemption already on their system. Some excerpts from that URL:
>
>
> *PreemptType*: Specifies the plugin used to identify which jobs can be
> preempted in order to start a pending job.
>
>- *preempt/none*: Job preemption is disabled (default).
>- *preempt/partition_prio*: Job preemption is based upon partition
>*PriorityTier*. Jobs in higher PriorityTier partitions may preempt
>jobs from lower PriorityTier partitions. This is not compatible with
>*PreemptMode=OFF*.
>
>
> which somewhat make it sounds like all partitions should have preemption
> set and not only some? I obviously have some "off" partitions. However
> elsewhere in that document it says
>
> *PreemptMode*: Mechanism used to preempt jobs or enable gang scheduling.
> When the *PreemptType* parameter is set to enable preemption, the
> *PreemptMode* in the main section of slurm.conf selects the default
> mechanism used to preempt the preemptable jobs for the cluster.
> *PreemptMode* may be specified on a per partition basis to override this
> default value if *PreemptType=preempt/partition_prio*.
>
> which kind of sounds like it should be okay (unless it means
> **everything** must be different than OFF). Yet still elsewhere in that
> same page it says
>
> On the other hand, if you want to use *PreemptType=preempt/partition_prio* to
> allow jobs from higher PriorityTier partitions to Suspend jobs from lower
> PriorityTier partitions, then you will need overlapping partitions, and
> *PreemptMode=SUSPEND,GANG* to use Gang scheduler to resume the suspended
> job(s). In either case, time-slicing won't happen between jobs on different
> partitions.
>
> Which somewhat sounds like only suspend and gang can be used as preemption
> modes, and not cancel (my preference) or requeue (perhaps acceptable, if I
> jump through some hoops).
>
> So to me the documentation is highly confusing about what can or cannot be
> used together with what else, and the examples at the bottom of the page
> are nice, but they do not specify the full settings. Particularly this one
> https://slurm.schedmd.com/preempt.html#example2 is close enough to mine,
> but it does not tell what PreemptType has been chosen (nor if "cancel"
> would be allowed or not in that setup).
>
> Thanks again!
>
> On Fri, Jan 12, 2024 at 7:22 AM Paul Edmon  wrote:
>
>> At least in the example you are showing you have PreemptType commented
>> out, which means it will return the default. PreemptMode 

Re: [slurm-users] preemptable queue

2024-01-12 Thread Davide DelVento
Thanks Paul,

I don't understand what you mean by having a typo somewhere. I mean, that
configuration works just fine right now, whereas if I uncomment that line,
any slurm command will just abort with the error "PreemptType and
PreemptMode values incompatible". So, assuming there is a typo, it should
be in the commented line, right? Or are you saying that having that line
makes slurm sensitive to a typo somewhere else that would be otherwise
ignored? Obviously I can't exclude that option, but it seems unlikely to
me, also because it does say these two things are incompatible.

It would obviously be much better if the error said what EXACTLY is
incompatible with what, but in the documentation at
https://slurm.schedmd.com/preempt.html I see many clues of what that could
be, and hence I am asking people here who may have deployed preemption
already on their system. Some excerpts from that URL:


*PreemptType*: Specifies the plugin used to identify which jobs can be
preempted in order to start a pending job.

   - *preempt/none*: Job preemption is disabled (default).
   - *preempt/partition_prio*: Job preemption is based upon partition
   *PriorityTier*. Jobs in higher PriorityTier partitions may preempt jobs
   from lower PriorityTier partitions. This is not compatible with
   *PreemptMode=OFF*.


which somewhat makes it sound like all partitions should have preemption
set and not only some? I obviously have some "off" partitions. However,
elsewhere in that document it says

*PreemptMode*: Mechanism used to preempt jobs or enable gang scheduling.
When the *PreemptType* parameter is set to enable preemption, the
*PreemptMode* in the main section of slurm.conf selects the default
mechanism used to preempt the preemptable jobs for the cluster.
*PreemptMode* may be specified on a per partition basis to override this
default value if *PreemptType=preempt/partition_prio*.

which kind of sounds like it should be okay (unless it means **everything**
must be different than OFF). Yet still elsewhere in that same page it says

On the other hand, if you want to use *PreemptType=preempt/partition_prio* to
allow jobs from higher PriorityTier partitions to Suspend jobs from lower
PriorityTier partitions, then you will need overlapping partitions, and
*PreemptMode=SUSPEND,GANG* to use Gang scheduler to resume the suspended
job(s). In either case, time-slicing won't happen between jobs on different
partitions.

Which somewhat sounds like only suspend and gang can be used as preemption
modes, and not cancel (my preference) or requeue (perhaps acceptable, if I
jump through some hoops).

So to me the documentation is highly confusing about what can or cannot be
used together with what else, and the examples at the bottom of the page
are nice, but they do not specify the full settings. Particularly this one
https://slurm.schedmd.com/preempt.html#example2 is close enough to mine,
but it does not tell what PreemptType has been chosen (nor if "cancel"
would be allowed or not in that setup).

Thanks again!

On Fri, Jan 12, 2024 at 7:22 AM Paul Edmon  wrote:

> At least in the example you are showing you have PreemptType commented
> out, which means it will return the default. PreemptMode Cancel should
> work, I don't see anything in the documentation that indicates it
> wouldn't.  So I suspect you have a typo somewhere in your conf.
>
> -Paul Edmon-
> On 1/11/2024 6:01 PM, Davide DelVento wrote:
>
> I would like to add a preemptable queue to our cluster. Actually I already
> have. We simply want jobs submitted to that queue be preempted if there are
> no resources available for jobs in other (high priority) queues.
> Conceptually very simple, no conditionals, no choices, just what I wrote.
> However it does not work as desired.
>
> This is the relevant part:
>
> grep -i Preemp /opt/slurm/slurm.conf
> #PreemptType = preempt/partition_prio
> PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12]
> State=UP PreemptMode=off PriorityTier=200
> PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP
> PreemptMode=off PriorityTier=500
> PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP
> PreemptMode=cancel PriorityTier=100
>
>
> That PreemptType setting (now commented) fully breaks slurm, everything
> refuses to run with errors like
>
> $ squeue
> squeue: error: PreemptType and PreemptMode values incompatible
> squeue: fatal: Unable to process configuration file
>
> If I understand correctly the documentation at
> https://slurm.schedmd.com/preempt.html that is because preemption cannot
> cancel jobs based on partition priority, which (if true) is really
> unfortunate. I understand that allowing cross-partition time-slicing could
> be tricky and so I understand why that isn't allowed, but cancelling?
> Anyway, I have to questions:
>
> 1) is

[slurm-users] preemptable queue

2024-01-11 Thread Davide DelVento
I would like to add a preemptable queue to our cluster. Actually I already
have. We simply want jobs submitted to that queue be preempted if there are
no resources available for jobs in other (high priority) queues.
Conceptually very simple, no conditionals, no choices, just what I wrote.
However it does not work as desired.

This is the relevant part:

grep -i Preemp /opt/slurm/slurm.conf
#PreemptType = preempt/partition_prio
PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12]
State=UP PreemptMode=off PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP
PreemptMode=off PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP
PreemptMode=cancel PriorityTier=100


That PreemptType setting (now commented) fully breaks slurm, everything
refuses to run with errors like

$ squeue
squeue: error: PreemptType and PreemptMode values incompatible
squeue: fatal: Unable to process configuration file

If I understand correctly the documentation at
https://slurm.schedmd.com/preempt.html that is because preemption cannot
cancel jobs based on partition priority, which (if true) is really
unfortunate. I understand that allowing cross-partition time-slicing could
be tricky and so I understand why that isn't allowed, but cancelling?
Anyway, I have a few questions:

1) is that correct and so should I avoid using either partition priority or
cancelling?
2) is there an easy way to trick slurm into requeuing and then have those
jobs cancelled instead?
3) I guess the cleanest option would be to implement QoS, but I've never
done it and we don't really need it for anything else other than this. The
documentation looks complicated, but is it? The great Ole's website is
unavailable at the moment...

Thanks!!


Re: [slurm-users] Reproducible irreproducible problem (timeout?)

2023-12-20 Thread Davide DelVento
Not an answer to your question, but if the jobs need to be subdivided, why
not submit smaller jobs?

Also, this does not sound like a slurm problem, but rather a code or
infrastructure issue.

Finally, are you typically able to ssh into the main node of each subtask?
In many places that is not allowed and you would get the "Authentication
failed" error regardless... Some places (but definitely not all) allow
instead logging in with something like

srun --jobid <jobid> --pty bash

Where obviously <jobid> is your job ID. Hope this helps


On Wed, Dec 20, 2023 at 6:34 AM Laurence Marks 
wrote:

> I know that sounds improbable, but please readon.
>
> I am running a reasonably large job on a University supercomputer (not a
> national facility) with 12 nodes on 64 core nodes. The job loops through a
> sequence of commands some of which are single cpu, but with a slow step
> where 3 tasks each with 4 nodes running hybrid omp/mpi are launched. I use
> mpirun for this (Intel impi), which in turn uses srun for each. These slow
> steps run for about 50 minutes. The full job runs for 48 hours, and I am
> typically queueing 11 of these at a time to run in parallel on different
> nodes.
>
> After some (irreproducible) time, often one of the three slow tasks hangs.
> A symptom is that if I try and ssh into the main node of the subtask (which
> is running 128 mpi on the 4 nodes) I get "Authentication failed". Sometimes
> I can kill the mpiexec on the main parent node and this will propagate and
> I can continue (with some fault handling).
>
> I know most people expect a single srun to be used, rather than a complex
> loop as above. The reason is that it is much, much more efficient to
> subdivide the problem, and also code maintenance is better with
> subproblems. This is an established code (been around 20+ years). I wonder
> if there are some timeouts or something similar which drop connectivity. I
> also wonder whether repeated launching of srun subtasks might be doing
> something beyond what is normally expected.
>
> --
> Emeritus Professor Laurence Marks (Laurie)
> Northwestern University
> Webpage  and Google Scholar link
> 
> "Research is to see what everybody else has seen, and to think what nobody
> else has thought", Albert Szent-Györgyi
>


Re: [slurm-users] powersave: excluding nodes

2023-12-18 Thread Davide DelVento
Could it be a similar problem to the one in the "SlurmctldHost confusion"
thread, in which (if you weren't paying attention) the comma-separated syntax
doesn't work, but repeating the thing on multiple lines achieved what that
person intended?
On the other hand, for AccountingStoreFlags the opposite was true: I had two
lines, one specifying job_script and the other job_comment, and only the last
one was honored, until I noticed and consolidated them into one line,
comma-separating the arguments...
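In other words, what works is a single consolidated line like this
(job_script/job_comment just being the flags I happened to need):

# slurm.conf: flags must be comma-separated on ONE line; a second
# AccountingStoreFlags line overrides the first
AccountingStoreFlags=job_script,job_comment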


On Mon, Dec 11, 2023 at 9:52 AM Davide DelVento 
wrote:

> Forgot to mention: this is with slurm 23.02.6 (apologize for the double
> message)
>
> On Mon, Dec 11, 2023 at 9:49 AM Davide DelVento 
> wrote:
>
>> Following the example from https://slurm.schedmd.com/power_save.html
>> regarding SuspendExcNodes
>>
>> I configured my slurm.conf with
>>
>> SuspendExcNodes=node[01-12]:2,node[13-32]:2,node[33-34]:1,nodegpu[01-02]:1
>>
>> SuspendExcStates=down,drain,fail,maint,not_responding,reserved
>> #SuspendExcParts=
>>
>> (the nodes in the different groups have different amounts of physical
>> memory).
>>
>> Unfortunately, it seems to me that slurm does not honor such a setting
>> and excludes only the two nodes from one group, but shuts off everything
>> else. Is there another setting which may inadvertently cause this problem,
>> or that's a known bug?
>>
>> Thanks!
>>
>


Re: [slurm-users] [External] Re: Troubleshooting job stuck in Pending state

2023-12-12 Thread Davide DelVento
I am not a Slurm expert by any stretch of the imagination, so my answer is
not authoritative.

That said, I am not aware of any functional equivalent for Slurm, and I
would love to learn that I am mistaken!

On Tue, Dec 12, 2023 at 1:39 AM Pacey, Mike  wrote:

> Hi Davide,
>
>
>
> The jobs do eventually run, but can take several minutes or sometimes
> several hours to switch to a running state even when there’s plenty of
> resources free immediately.
>
>
>
> With Grid Engine it was possible to turn on scheduling diagnostics and get
> a summary of the scheduler’s decisions on a pending job by running “qstat
> -j jobid”. But there doesn’t seem to be any functional equivalent with
> SLURM?
>
>
>
> Regards,
>
> Mike
>
>
>
>
>
> *From:* slurm-users  *On Behalf Of
> *Davide DelVento
> *Sent:* Monday, December 11, 2023 4:23 PM
> *To:* Slurm User Community List 
> *Subject:* [External] Re: [slurm-users] Troubleshooting job stuck in
> Pending state
>
>
>
> *This email originated outside the University. Check before clicking links
> or attachments.*
>
> By getting "stuck" do you mean the job stays PENDING forever or does
> eventually run? I've seen the latter (and I agree with you that I wish
> Slurm will log things like "I looked at this job and I am not starting it
> yet because") but not the former
>
>
>
> On Fri, Dec 8, 2023 at 9:00 AM Pacey, Mike 
> wrote:
>
> Hi folks,
>
>
>
> I’m looking for some advice on how to troubleshoot jobs we occasionally
> see on our cluster that are stuck in a pending state despite sufficient
> matching resources being free. In the case I’m trying to troubleshoot the
> Reason field lists (Priority) but to find any way to get the scheduler to
> tell me what exactly is the priority job blocking.
>
>
>
>- I tried setting the scheduler log level to debug3 for 5 minutes at
>one point, but my logfile ballooned from 0.5G to 1.5G and didn’t offer any
>useful info for this case.
>- I’ve tried ‘scontrol schedloglevel 1’ but it returns the error:
>‘slurm_set_schedlog_level error: Requested operation is presently disabled’
>
>
>
> I’m aware that the backfill scheduler will occasionally hold on to free
> resources in order to schedule a larger job with higher priority, but in
> this case I can’t find any pending job that might fit the bill.
>
>
>
> And to possibly complicate matters, this is on a large partition that has
> no maximum time limit and most pending jobs have no time limits either. (We
> use backfill/fairshare as we have smaller partitions of rarer resources
> that benefit from it, plus we’re aiming to use fairshare even on the
> no-time-limits partitions to help balance out usage).
>
>
>
> Hoping someone can provide pointers.
>
>
>
> Regards,
>
> Mike
>
>


Re: [slurm-users] powersave: excluding nodes

2023-12-11 Thread Davide DelVento
Forgot to mention: this is with slurm 23.02.6 (apologies for the double
message)

On Mon, Dec 11, 2023 at 9:49 AM Davide DelVento 
wrote:

> Following the example from https://slurm.schedmd.com/power_save.html
> regarding SuspendExcNodes
>
> I configured my slurm.conf with
>
> SuspendExcNodes=node[01-12]:2,node[13-32]:2,node[33-34]:1,nodegpu[01-02]:1
> SuspendExcStates=down,drain,fail,maint,not_responding,reserved
> #SuspendExcParts=
>
> (the nodes in the different groups have different amounts of physical
> memory).
>
> Unfortunately, it seems to me that slurm does not honor such a setting and
> excludes only the two nodes from one group, but shuts off everything else.
> Is there another setting which may inadvertently cause this problem, or
> that's a known bug?
>
> Thanks!
>


Re: [slurm-users] Slurm powersave

2023-12-11 Thread Davide DelVento
In case it's useful to others: I've been able to get this working by having
the "no action" script stop the slurmd daemon and start it *with the -b
option*.
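In case it helps, a sketch of what such a "no action" ResumeProgram could
look like (assuming slurmd is launched directly rather than via systemd and
that the controller can ssh to the nodes; the matching SuspendProgram could
simply exit 0 -- adapt all of this to your setup):

#!/bin/bash
# Hypothetical "no action" ResumeProgram: nodes are never really powered
# off, so "resuming" just restarts slurmd with -b so that slurmctld sees
# the node as freshly booted. $1 is the hostlist Slurm passes in.
for node in $(scontrol show hostnames "$1"); do
    ssh "$node" "pkill slurmd; sleep 2; slurmd -b" &
done
wait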

On Fri, Oct 6, 2023 at 4:28 AM Ole Holm Nielsen 
wrote:

> Hi Davide,
>
> On 10/5/23 15:28, Davide DelVento wrote:
> > IMHO, "pretending" to power down nodes defies the logic of the Slurm
> > power_save plugin.
> >
> > And it is sure useless ;)
> > But I was using the suggestion from
> > https://slurm.schedmd.com/power_save.html which says
> >
> > You can also configure Slurm with programs that perform no action as
> > *SuspendProgram* and *ResumeProgram* to assess the potential impact of
> > power saving mode before enabling it.
>
> I had not noticed the above sentence in the power_save manual before!  So
> I decided to test a "no action" power saving script, similar to what you
> have done, applying it to a test partition.  I conclude that "no action"
> power saving DOES NOT WORK, at least in Slurm 23.02.5.  So I opened a bug
> report https://bugs.schedmd.com/show_bug.cgi?id=17848 to find out if the
> documentation is obsolete, or if there may be a bug.  Please follow that
> bug to find out the answer from SchedMD.
>
> What I *believe* (but not with 100% certainty) really happens with power
> saving in the current Slurm versions is what I wrote yesterday:
>
> > Slurmctld expects suspended nodes to *really* power
> > down (slurmd is stopped).  When slurmctld resumes a suspended node,
> it
> > expects slurmd to start up when the node is powered on.  There is a
> > ResumeTimeout parameter which I've set to about 15-30 minutes in
> case of
> > delays due to BIOS updates and the like - the default of 60 seconds
> is
> > WAY too small!
>
> I hope this helps,
> Ole
>
>


[slurm-users] powersave: excluding nodes

2023-12-11 Thread Davide DelVento
Following the example from https://slurm.schedmd.com/power_save.html
regarding SuspendExcNodes

I configured my slurm.conf with

SuspendExcNodes=node[01-12]:2,node[13-32]:2,node[33-34]:1,nodegpu[01-02]:1
SuspendExcStates=down,drain,fail,maint,not_responding,reserved
#SuspendExcParts=

(the nodes in the different groups have different amounts of physical
memory).

Unfortunately, it seems to me that slurm does not honor such a setting and
excludes only the two nodes from one group, but shuts off everything else.
Is there another setting which may inadvertently cause this problem, or
that's a known bug?

Thanks!


Re: [slurm-users] Troubleshooting job stuck in Pending state

2023-12-11 Thread Davide DelVento
By getting "stuck" do you mean the job stays PENDING forever, or that it does
eventually run? I've seen the latter (and I agree with you that I wish
Slurm would log things like "I looked at this job and I am not starting it
yet because...") but not the former.

On Fri, Dec 8, 2023 at 9:00 AM Pacey, Mike  wrote:

> Hi folks,
>
>
>
> I’m looking for some advice on how to troubleshoot jobs we occasionally
> see on our cluster that are stuck in a pending state despite sufficient
> matching resources being free. In the case I’m trying to troubleshoot the
> Reason field lists (Priority) but to find any way to get the scheduler to
> tell me what exactly is the priority job blocking.
>
>
>
>- I tried setting the scheduler log level to debug3 for 5 minutes at
>one point, but my logfile ballooned from 0.5G to 1.5G and didn’t offer any
>useful info for this case.
>- I’ve tried ‘scontrol schedloglevel 1’ but it returns the error:
>‘slurm_set_schedlog_level error: Requested operation is presently disabled’
>
>
>
> I’m aware that the backfill scheduler will occasionally hold on to free
> resources in order to schedule a larger job with higher priority, but in
> this case I can’t find any pending job that might fit the bill.
>
>
>
> And to possibly complicate matters, this is on a large partition that has
> no maximum time limit and most pending jobs have no time limits either. (We
> use backfill/fairshare as we have smaller partitions of rarer resources
> that benefit from it, plus we’re aiming to use fairshare even on the
> no-time-limits partitions to help balance out usage).
>
>
>
> Hoping someone can provide pointers.
>
>
>
> Regards,
>
> Mike
>


Re: [slurm-users] Disabling SWAP space will it effect SLURM working

2023-12-11 Thread Davide DelVento
A little late here, but yes, everything Hans said is correct, and if you are
worried about slurm (or other critical system software) getting killed by the
OOM killer, you can work around it by properly configuring cgroups.
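For example, the kind of settings I mean (values are purely illustrative, and
this assumes TaskPlugin=task/cgroup is enabled in slurm.conf):

# cgroup.conf: confine jobs to the memory they requested, so a runaway
# job is OOM-killed inside its own cgroup instead of taking down slurmd
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
AllowedSwapSpace=0

# slurm.conf: reserve some RAM for the OS and the Slurm daemons
NodeName=node[01-04] RealMemory=257000 MemSpecLimit=8192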

On Wed, Dec 6, 2023 at 2:06 AM Hans van Schoot  wrote:

> Hi Joseph,
>
> This might depend on the rest of your configuration, but in general swap
> should not be needed for anything on Linux.
> BUT: you might get OOM killer messages in your system logs, and SLURM
> might fall victim to the OOM killer (OOM = Out Of Memory) if you run
> applications on the compute node that eat up all your RAM.
> Swap does not prevent against this, but makes it less likely to happen.
> I've seen OOM kill slurm daemon processes on compute nodes with swap,
> usually slurm recovers just fine after the application that ate up all the
> RAM ends up getting killed by the OOM killer. My compute nodes are not
> configured to monitor memory usage of jobs. If you have memory configured
> as a managed resource in your SLURM setup, and you leave a bit of headroom
> for the OS itself (e.g. only hand out a maximum of 250GB RAM to jobs on
> your 256GB RAM nodes), you should be fine.
>
> cheers,
> Hans
>
>
> ps. I'm just a happy slurm user/admin, not an expert, so I might be wrong
> about everything :-)
>
>
>
> On 06-12-2023 05:57, John Joseph wrote:
>
> Dear All,
> Good morning
> We have 4 node   [256 GB Ram in each node]  SLURM instance  with which we
> installed and it is working fine.
> We have 2 GB of SWAP space on each node,  for some purpose  to make the
> system in full use want to disable the SWAP memory,
>
> Like to know if I am disabling the SWAP  partition will it efffect SLURM
> functionality .
>
> Advice requested
> Thanks
> Joseph John
>
>
>


Re: [slurm-users] slurm power save question

2023-11-29 Thread Davide DelVento
Thanks and no worries for the time it took to reply.

Sounds good then, and it's consistent with what the documentation says,
namely "prevent those nodes from being powered down". As you said "keep
that number of nodes up" is a different thing, and yes, it would be nice to
have.
For that purpose, I'm looking at my logs of workload and mulling whether I
should make a cron job (submitting dummy slurm jobs) to force slurm to bring
nodes up if not enough idle ones are available, to reduce queue wait for
users' jobs.

Thanks again

On Wed, Nov 29, 2023 at 8:43 AM Brian Andrus  wrote:

> Sorry for the late reply.
>
> For my site, I used the optional ":" separator to ensure at least 4 nodes
> were up. Eg: nid[10-20]:4
> This means at least 4 nodes.. those nodes do not have to be the same 4 at
> any time, so if one is down that used to be idle, but 4 are up, that 1 will
> not be brought back up. I don't see this setting having much of anything to
> do with bringing nodes up at all with the exception of when you first start
> slurmctld and the settings are not met. Once there are jobs running on any
> of the listed nodes, they count toward the number. That is my experience
> with the small numbers I used. YMMV.
>
> I have also explicitly stated nodes without the separator, which does
> work. I do that when I am trying to look at a node that is idle without a
> job on it. That stops slurm from shutting it down while I am looking at it.
>
> Although, I do agree, the functionality of being able to have "keep at
> least X nodes up and idle" would be nice, that is not how I see this
> documented or working.
>
> Brian Andrus
> On 11/23/2023 5:12 AM, Davide DelVento wrote:
>
> Thanks for confirming, Brian. That was my understanding as well. Do you
> have it working that way on a machine you have access to?  If so, I'd be
> interested to see the config file, because that's not the behavior I am
> experiencing in my tests.
> In fact, in my tests Slurm will not bring down those "X nodes" but will
> not bring them up either, *unless* there is a job targeted to those. I may
> have something misconfigured, and I'd love to fix that.
>
> Thanks!
>
> On Wed, Nov 22, 2023 at 5:46 PM Brian Andrus  wrote:
>
>> As I understand it, that setting means "Always have at least X nodes up",
>> which includes running jobs. So it stops any wait time for the first X jobs
>> being submitted, but any jobs after that will need to wait for the power_up
>> sequence.
>>
>> Brian Andrus
>> On 11/22/2023 6:58 AM, Davide DelVento wrote:
>>
>> I've started playing with powersave and have a question about
>> SuspendExcNodes. The documentation at
>> https://slurm.schedmd.com/power_save.html says
>>
>> For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not
>> DOWN, DRAINING or already powered down) in the set nid[10-20] from being
>> powered down.
>>
>> I initially interpreted that as "Slurm will try to keep 4 nodes idle on
>> as much as possible", which would have reduced the wait time for new jobs
>> targeting those nodes. Instead, it appears to mean "Slurm will not shut off
>> the last 4 nodes which are idle in that partition, however it will not turn
>> on nodes which it shut off earlier unless jobs are scheduled on them"
>>
>> Most notably if the 4 idle nodes will be allocated to other jobs (and so
>> they are no idle anymore) slurm does not turn on any nodes which have been
>> shut off earlier, so it's possible (and depending on workloads perhaps even
>> common) to have no idle nodes on regardless of the SuspendExcNode settings.
>>
>> Is that how it works, or do I have anything else in my setting which is
>> causing this unexpected-to-me behavior? I think I can live with it, but
>> IMHO it would have been better if slurm attempted to turn on nodes
>> preemptively trying to match the requested SuspendExcNodes, rather than
>> waiting for job submissions.
>>
>> Thanks and Happy Thanksgiving to people in the USA
>>
>>


Re: [slurm-users] slurm power save question

2023-11-23 Thread Davide DelVento
Thanks for confirming, Brian. That was my understanding as well. Do you
have it working that way on a machine you have access to?  If so, I'd be
interested to see the config file, because that's not the behavior I am
experiencing in my tests.
In fact, in my tests Slurm will not bring down those "X nodes" but will not
bring them up either, *unless* there is a job targeted to those. I may have
something misconfigured, and I'd love to fix that.

Thanks!

On Wed, Nov 22, 2023 at 5:46 PM Brian Andrus  wrote:

> As I understand it, that setting means "Always have at least X nodes up",
> which includes running jobs. So it stops any wait time for the first X jobs
> being submitted, but any jobs after that will need to wait for the power_up
> sequence.
>
> Brian Andrus
> On 11/22/2023 6:58 AM, Davide DelVento wrote:
>
> I've started playing with powersave and have a question about
> SuspendExcNodes. The documentation at
> https://slurm.schedmd.com/power_save.html says
>
> For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not
> DOWN, DRAINING or already powered down) in the set nid[10-20] from being
> powered down.
>
> I initially interpreted that as "Slurm will try to keep 4 nodes idle on as
> much as possible", which would have reduced the wait time for new jobs
> targeting those nodes. Instead, it appears to mean "Slurm will not shut off
> the last 4 nodes which are idle in that partition, however it will not turn
> on nodes which it shut off earlier unless jobs are scheduled on them"
>
> Most notably if the 4 idle nodes will be allocated to other jobs (and so
> they are no idle anymore) slurm does not turn on any nodes which have been
> shut off earlier, so it's possible (and depending on workloads perhaps even
> common) to have no idle nodes on regardless of the SuspendExcNode settings.
>
> Is that how it works, or do I have anything else in my setting which is
> causing this unexpected-to-me behavior? I think I can live with it, but
> IMHO it would have been better if slurm attempted to turn on nodes
> preemptively trying to match the requested SuspendExcNodes, rather than
> waiting for job submissions.
>
> Thanks and Happy Thanksgiving to people in the USA
>
>


Re: [slurm-users] Dynamic MIG Question

2023-11-22 Thread Davide DelVento
I assume you mean the sentence about dynamic MIG at
https://slurm.schedmd.com/gres.html#MIG_Management
Could it be supported? I think so, but only if one of their paying
customers (that could be you) asks for it.

On Wed, Nov 22, 2023 at 11:24 AM Aaron Kollmann <
aaron.kollm...@student.hpi.de> wrote:

> Hello All,
>
> I am currently working in a research project and we are trying to find out
> whether we can use NVIDIAs multi-instance GPU (MIG) dynamically in SLURM.
>
> For instance:
>
> - a user requests a job and wants a GPU but none is available
>
> - now SLURM will reconfigure a MIG GPU to create a partition (e.g. 1g.5gb)
> which becomes available and allocated immediately
>
> I can already reconfigure MIG + SLURM within a few seconds to start jobs
> on newly partitioned resources, but Jobs get killed when I restart slurmd
> on nodes with a changed MIG config. (see script example below)
>
> *Do you think it is possible to develop a plugin or change SLURM to the
> extent that dynamic MIG will be supported one day? *
>
> (The website says it is not supported)
>
>
>
> Best
>
> - Aaron
>
>
>
>
> #!/usr/bin/bash
>
> # Generate Start Config
> killall slurmd
> killall slurmctld
> nvidia-smi mig -dci
> nvidia-smi mig -dgi
> nvidia-smi mig -cgi 19,14,5 -i 0 -C
> nvidia-smi mig -cgi 0 -i 1 -C
> cp -f ./slurm-19145-0.conf /etc/slurm/slurm.conf
> slurmd -c
> slurmctld -c
> sleep 5
>
> # Start a running and a pending job (the first job gets killed by slurm)
> srun -w gx06 -c 2 --mem 1G --gres=gpu:a100_1g.5gb:1 sleep 300 &
> srun -w gx06 -c 2 --mem 1G --gres=gpu:a100_1g.5gb:1 sleep 300 &
> sleep 5
>
> # Simulate MIG Config Change
> nvidia-smi mig -i 1 -dci
> nvidia-smi mig -i 1 -dgi
> nvidia-smi mig -cgi 19,14,5 -i 1 -C
> cp -f ./slurm-2x19145.conf /etc/slurm/slurm.conf
> killall slurmd
> killall slurmctld
> slurmd
> slurmctld
>


[slurm-users] slurm power save question

2023-11-22 Thread Davide DelVento
I've started playing with powersave and have a question about
SuspendExcNodes. The documentation at
https://slurm.schedmd.com/power_save.html says

For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not
DOWN, DRAINING or already powered down) in the set nid[10-20] from being
powered down.

I initially interpreted that as "Slurm will try to keep 4 idle nodes powered
on as much as possible", which would have reduced the wait time for new jobs
targeting those nodes. Instead, it appears to mean "Slurm will not shut off
the last 4 nodes which are idle in that partition; however, it will not turn
on nodes which it shut off earlier unless jobs are scheduled on them".

Most notably, if those 4 idle nodes are later allocated to other jobs (and so
they are no longer idle), slurm does not turn on any nodes which have been
shut off earlier, so it's possible (and depending on workloads perhaps even
common) to have no idle nodes powered on, regardless of the SuspendExcNodes setting.

Is that how it works, or do I have anything else in my setting which is
causing this unexpected-to-me behavior? I think I can live with it, but
IMHO it would have been better if slurm attempted to turn on nodes
preemptively trying to match the requested SuspendExcNodes, rather than
waiting for job submissions.

Thanks and Happy Thanksgiving to people in the USA


Re: [slurm-users] SLURM new user query, does SLURM has GUI /Web based management version also

2023-11-20 Thread Davide DelVento
Not sure if that's what you are looking for, Joseph, but I believe
ClusterVisor and Bright do provide some basic Slurm management as a web GUI.
I don't think either is available outside of the support for hw purchased
from the respective vendors.
See e.g. https://www.advancedclustering.com/products/software/clustervisor/

On Sun, Nov 19, 2023 at 3:52 AM Ole Holm Nielsen 
wrote:

> On 19-11-2023 09:11, Joseph John wrote:
> > I am new user, trying out SLURM
> >
> > Like to check if the SLURM has a GUI/web based management tool also
>
> Did you read the Quick Start Administrator Guide at
> https://slurm.schedmd.com/quickstart_admin.html ?
>
> I don't believe there are any Slurm management tools as a web GUI, and
> that would probably be a security nightmare anyway because privileged
> system access is required.
>
> There are a number of monitoring tools for viewing the status of Slurm
> jobs.
>
> /Ole
>
>


Re: [slurm-users] job_desc.pn-min-memory in LUA jobsubmit-plugin

2023-11-17 Thread Davide DelVento
I don't have an answer for you, but I found your message in my spam folder.
I brought it out and I'm replying to it in the hope that it gets some
visibility in people's mailboxes.

Note that in the US it's SC week and many people are or have been busy with
it and will be travelling in the next days, then next week is Thanksgiving
and some people take the week off, so you may still have to wait for a week
or so to get an answer -- unless you get one from somewhere else in the world,
that is ;-)

On Thu, Nov 16, 2023 at 12:54 AM Marx, Wolfgang <
wolfgang.m...@tu-darmstadt.de> wrote:

> Hello,
>
>
>
> we are using Slurm Version 23.02.3 and are working on a job_submit-plugin
> written in LUA.
>
>
>
> During the development of the script we found out, that values we give for
>
> --mem will appear in the job-submit-plugin in the variable
> job_desc.pn-min-memory
>
> and for
>
> --mem-per-cpu will appear in the variable job_desc.min_mem_per_cpu
>
>
>
> During our tests we now see a strange behavior:
>
> When we start a job without --mem or --mem-per-cpu ,
> job_desc.pn-min-memory shows up with the value  1.844674407371e+19
>
> When we start a job without --mem but --mem-per-cpu is set,
> job_desc.pn-min-memory shows up with the value 9.2233720368548e+18
>
>
>
> Why does the NO_VAL value for --mem differ depending on whether --mem-per-cpu
> is set or not?
>
>
>
> In the documentation I could not find a proper explanation.
>
>
>
> Kind regards
>
> Wolfgang
>
>
>
> Wolfgang Marx, Basisdienste, Gruppe Hochleistungrechnen
>
> Technische Universität Darmstadt, Hochschulrechenzentrum
>
> Alexanderstraße 2, 64283 Darmstadt
>
> Tel.: +496151/16-71158
>
> E-Mail: wolfgang.m...@tu-darmstadt.de
>
> Web: www.hrz.tu-darmstadt.de
>
>
>


Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-10 Thread Davide DelVento
>
> Having a large number of researchers able to run arbitrary code on the
> same submit host has a marked tendency to result in an overloaded host.
> There are various ways to regulate that ranging from "constant scolding" to
> "aggressive quotas/cgroups/etc", but all involve some degree of
> inconvenience for all concerned.   So the desire is to do the same things
> they are currently doing, but on a node they do not have to share.
>

If you have enough resources this could be a node managed by slurm, and you
can use allocations to make sure people play nice.


> For example, user X has a framework that consumes data from various
> sources, crunches it in Slurm by executing s* commands, and spits out
> reports to a NAS share.   The framework itself is long-running and
> interactive, so they prefer to keep it out of Slurm; however it is also
> quite heavy, and thus a poor fit for a shared system.  This can be
> addressed in many ways, but the lowest-effort route (from user X's point of
> view) would be to simply run the existing framework somewhere else so they
> do not need to share.
>

Why not a dedicated node on your cluster?

Open OnDemand works really great for this use case! Give it a try, there is
a nice demo install which you can use for testing it (no install required
on your side): https://openondemand.org/run-open-ondemand

If that does not work for you, Jared's (*) suggestion of wrapping slurm
commands in ssh scripts or the like sounds like your best bet.

(*): Hi Jared, long time... hope you're doing well.
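
For what it's worth, the ssh wrapping can be as thin as a stub like this on
each workstation (sketch only: the submit host name is a placeholder, and
arguments containing spaces or shell metacharacters need extra quoting care):

#!/usr/bin/bash
# local "squeue" stand-in: forward the command to the shared submit host
exec ssh -q submit01.example.org squeue "$@"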


Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-09 Thread Davide DelVento
Not a direct answer to your question, but have you looked at Open OnDemand?
Or maybe JupyterHub?
I think most places today prefer either of those, which provide roughly the
functionality you asked for - and much more.

On Thu, Nov 9, 2023 at 4:17 PM Chip Seraphine 
wrote:

> Hello,
>
> Our users submit their jobs from shared submit hosts, and have expressed
> an understandable preference for being able to submit directly from their
> own workstations.   The obvious solution (installing the slurm client on
> their workstations, or providing a container that does something similar)
> are not available to us because of security concerns.   This leaves REST as
> the best option.   We’re hoping to provide a REST-based toolset that users
> familiar with the command line tools can make immediate use of (so,
> provides basic, stripped-down functionality of srun, squeue, sacct, and
> sinfo).  Basically, we want to create a subset of the s* commands that can
> be run from some arbitrary machine if the user has the appropriate token.
>
> It’d be surprising if we were the first people to go down this path, but
> searching has turned up nothing.   Is there a project anyone knows about
> out there for providing command-line SLURM commands that use REST to talk
> to the daemons?   Or am I missing some obvious solution here?
>
> --
>
> Chip Seraphine
> Grid Operations
> For support please use help-grid in email or slack.
> This e-mail and any attachments may contain information that is
> confidential and proprietary and otherwise protected from disclosure. If
> you are not the intended recipient of this e-mail, do not read, duplicate
> or redistribute it by any means. Please immediately delete it and any
> attachments and notify the sender that you have received it by mistake.
> Unintended recipients are prohibited from taking action on the basis of
> information in this e-mail or any attachments. The DRW Companies make no
> representations that this e-mail or any attachments are free of computer
> viruses or other defects.
>


Re: [slurm-users] SLURM , maximum scalable instance is which one

2023-11-01 Thread Davide DelVento
Not sure if it's the largest, but LUMI is a very large one

https://www.top500.org/system/180048/
https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/partitions/

On Sun, Oct 29, 2023 at 4:16 AM John Joseph  wrote:

> Dear All,
> Like to know what is the maximum scaled-up instance of SLURM so
> far.  From which web site can I get information on the largest scalable
> instance of SLURM and other popular setups using SLURM?
> Thanks
> Joseph John
>


Re: [slurm-users] Sinfo options not working in SLURM 23.11

2023-10-30 Thread Davide DelVento
>
> I am working on SLURM 23.11 version.
>

???

The latest version is slurm-23.02.6; which one are you referring to?
https://github.com/SchedMD/slurm/tags



>


Re: [slurm-users] Question about gdb sbatch

2023-10-23 Thread Davide DelVento
I think that should be sufficient.

On Sat, Oct 21, 2023 at 2:26 AM mohammed shambakey 
wrote:

> Thank you very much. I'm compiling from the source code.
>
> Another question please about the proper location for the "-g" option,
> because I see options like CFLAGS="-g" in slurm/configure file, and I
> wonder if I should add the "-g" to  some other locations?
>
> Regards
>
> On Sat, Oct 21, 2023 at 12:47 AM Davide DelVento 
> wrote:
>
>> Have you compiled slurm yourself or have you installed binaries? If the
>> latter, I speculate this is not possible, in that it would not have been
>> compiled with the required symbols (above all "-g" but probably others
>> depending on your platform).
>>
>> If you compiled slurm yourself, and assuming you have included all the
>> necessary symbols (or will re-compile appropriately and replace the
>> binaries and libraries), then it'd be like debugging any other thing: just
>> make sure to point gdb at the location of the source code, and then follow
>> any of the gazillion tutorials around about gdb. If you are not familiar
>> with gdb already, I strongly recommend that you start with some simpler
>> program before attempting something as big as slurm.
>>
>> Have a great weekend
>>
>> On Fri, Oct 20, 2023 at 9:30 AM mohammed shambakey 
>> wrote:
>>
>>> Hi
>>>
>>> Is it possible to debug "sbatch" itself when submitting a script? For
>>> example, I want to debug the following command:
>>> sbatch -Mall some_script.sh
>>>
>>> I don't want to debug the "some_script.sh". I want to debug the "sbatch"
>>> itself when submitting the "some_script.sh". I tried to use "gdb" but I'm
>>> not an expert, and it didn't work for me. I wonder if anyone did something
>>> like this? and how?
>>>
>>> Regards
>>>
>>>
>>> --
>>> Mohammed
>>>
>>
>
> --
> Mohammed
>


Re: [slurm-users] Question about gdb sbatch

2023-10-20 Thread Davide DelVento
Have you compiled slurm yourself or have you installed binaries? If the
latter, I speculate this is not possible, in that it would not have been
compiled with the required symbols (above all "-g" but probably others
depending on your platform).

If you compiled slurm yourself, and assuming you have included all the
necessary symbols (or will re-compile appropriately and replace the
binaries and libraries), then it'd be like debugging any other thing: just
make sure to point gdb at the location of the source code, and then follow
any of the gazillion tutorials around about gdb. If you are not familiar
with gdb already, I strongly recommend that you start with some simpler
program before attempting something as big as slurm.
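
For example, a rough sketch with placeholder paths (adjust the prefix, source
location and configure options for your setup):

# build with debug symbols and without optimization
./configure --prefix=/opt/slurm CFLAGS="-g -O0"
make -j && make install

# run sbatch itself under gdb, pointing gdb at the source tree
gdb --args /opt/slurm/bin/sbatch -Mall some_script.sh
# then inside gdb:
#   (gdb) directory /path/to/slurm-source/src/sbatch
#   (gdb) break main
#   (gdb) run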

Have a great weekend

On Fri, Oct 20, 2023 at 9:30 AM mohammed shambakey 
wrote:

> Hi
>
> Is it possible to debug "sbatch" itself when submitting a script? For
> example, I want to debug the following command:
> sbatch -Mall some_script.sh
>
> I don't want to debug the "some_script.sh". I want to debug the "sbatch"
> itself when submitting the "some_script.sh". I tried to use "gdb" but I'm
> not an expert, and it didn't work for me. I wonder if anyone did something
> like this? and how?
>
> Regards
>
>
> --
> Mohammed
>


Re: [slurm-users] Correct way to do logrotation

2023-10-17 Thread Davide DelVento
I'd be interested in this too, and I'm reposting only because the message
was flagged as both "dangerous email" and "spam", so people may not have
seen it (hopefully my reply will not suffer the same downfall...)

On Mon, Oct 16, 2023 at 3:26 AM Taras Shapovalov 
wrote:

> Hello,
>
> In the past it was recommended to reconfigure slurm daemons in logrotate
> script, sending a signal I believe was also the way to go. But recently I
> retested manual logrotation and I see that a removal of log file (for
> slurmctld, slurmdbd or slurmd) does not affect the logging of the daemons.
> The dameons just recreate the log files and continue to write logs there.
> What is the right way to go in case of the modern Slurm versions?
>
> Best regards,
>
> Taras
>
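
For what it's worth, my (untested) understanding is that the daemons re-open
their log files when they receive SIGUSR2, so a logrotate postrotate step
along these lines should still be the safe way to go -- treat this as a
sketch and adjust the daemon list per host:

# in a logrotate postrotate script, after the old log files have been renamed:
killall -SIGUSR2 slurmctld   # controller re-opens slurmctld.log
killall -SIGUSR2 slurmdbd    # dbd re-opens slurmdbd.log
killall -SIGUSR2 slurmd      # on compute nodes, re-opens slurmd.log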


Re: [slurm-users] hostlist members when calling resume and suspend with powersave

2023-10-06 Thread Davide DelVento
I don't think there is such a guarantee and in fact my reading of
https://slurm.schedmd.com/power_save.html#images means that most likely the
nodes can and will be mingled together and your script should untangle that.
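
In practice that means the resume/suspend script should expand the hostlist
expression it receives and loop over the individual nodes; a minimal sketch
(the actual power command is a placeholder):

#!/usr/bin/bash
# Slurm passes a compact hostlist (e.g. node[01-04,07]) which may cover nodes
# belonging to several jobs; expand it and handle each node separately.
hostlist="$1"
for node in $(scontrol show hostnames "$hostlist"); do
    echo "powering on $node"   # replace with ipmitool / cloud API call
done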

But as you probably guessed from my other message, I'm new to powersave in
slurm, definitely not an authoritative source.

On Fri, Oct 6, 2023 at 10:52 AM Marshall, John (SSC/SPC) <
john.marsh...@ssc-spc.gc.ca> wrote:

> Hi,
>
> When using powersaving and the cloud, do the hosts provided to the
> registered resume and suspend scripts all belong to the same job? IOW, for
> a particular invocation of the resume and suspend scripts, do the host
> lists contain hosts for more than one job?
>
> Thanks,
> John
>


Re: [slurm-users] Slurm powersave

2023-10-05 Thread Davide DelVento
Hi Ole,

Thanks for getting back to me.

> the great presentation
> > from our own
> I presented that talk at SLUG'23 :-)
>

Yes! That's why I wrote "from our own", but perhaps that's just local slang
where I live (and English is my second language)


> > 1) I'm not sure I fully understand ReconfigFlags=KeepPowerSaveSettings

As I understand it, the ReconfigFlags means that if you updated some
> settings using scontrol, they will be lost when slurmctld is reconfigured,
> and the settings from slurm.conf will be used in stead.
>

I see, so that applies to the case in which I change the (power) state of
the nodes by scontrol.


>
> > 2) the PDF above says that the problem with nodes in down and drained
> > state is solved in 23.02 but that does not appear to be the case. Before
> > running my experiment, I had
> >
> > $ sinfo -R
> > REASON   USER  TIMESTAMP   NODELIST
> > Not responding   root  2023-09-13T13:14:50 node31
> > ECC memory errorsroot  2023-08-26T07:21:04 node27
> >
> > and after it became
> >
> > $ sinfo -R
> > REASON   USER  TIMESTAMP   NODELIST
> > Not responding   root  2023-09-13T13:14:50 node31
> > none Unknown   Unknown node27
>
> Please use "sinfo -lR" so that we can see the node STATE.
>

$ sinfo -lR
Thu Oct 05 07:08:18 2023
REASON   USER TIMESTAMP   STATE  NODELIST
Not responding   root(0)  2023-09-13T13:14:50 down~  node31
none root(0)  Unknown drain  node27

Somehow it has now remembered that the user was root (it now shows that
even with plain sinfo -R)

> so probably that's not solved? Anyway, that's a nuisance, not a deal
> breaker
>
> With my 23.02.5 the SuspendExcStates is working as documented :-)
>

Okay so perhaps something happened between 23.02.3 and 23.02.5. I might
need to sleuth in the ticketing system.



> > 3) The whole thing does not appear to be working as I intended. My
> > understanding of the "exclude node" above should have meant that slurm
> > should never attempt to shut off more than all idle nodes in that
> > partition minus 2. Instead it shut them off all of them, and then tried
> to
> > turn them back on:
> >
> > $ sinfo | grep 512
> > compute512 up   infinite  1 alloc# node15
> > compute512 up   infinite  2  idle# node[14,32]
> > compute512 up   infinite  3  down~ node[16-17,31]
> > compute512 up   infinite  1 drain~ node27
> > compute512 up   infinite 12  idle~ node[18-26,28-30]
> > compute512 up   infinite  1  alloc node13
>
> I agree that 2 nodes from node[13-32] shouldn't be suspended, according to
> SuspendExcNodes in the slurm.conf manual.  I haven't tested this feature.
>

Good to know that an independent reading of the manual matches my
understanding. If you don't use this feature, what do you do? Shutting off
all idle nodes and leaving newly submitted jobs waiting for a boot? Or
something else?



> > But again this is a minor nuisance which I can live with


> 4) Most importantly from the output above you may have noticed two nodes
> > (actually three by the time I ran the command below) that slurm deemed
> down
> >
> > So I can confirm slurm invoked the script, but then waited for something
> > (what? starting slurmd?) which failed to occur and marked the node
> down.
> > When I removed the suspend time from the partition to end the
> experiment,
> > the other nodes went "magically" in production , without slurm calling
> my
> > poweron script. Of course the nodes were never powered off, but slurm
> > thought they were, so why it did not have the problem it id with the
> node
> > which instead intentionally tried to power on?
>
> IMHO, "pretending" to power down nodes defies the logic of the Slurm
> power_save plugin.


And it sure is useless ;)
But I was using the suggestion from
https://slurm.schedmd.com/power_save.html which says

You can also configure Slurm with programs that perform no action as
*SuspendProgram* and *ResumeProgram* to assess the potential impact of
power saving mode before enabling it.



> Slurmctld expects suspended nodes to *really* power
> down (slurmd is stopped).  When slurmctld resumes a suspended node, it
> expects slurmd to start up when the node is powered on.  There is a
> ResumeTimeout parameter which I've set to about 15-30 minutes in case of
> delays due to BIOS updates and the like - the default of 60 seconds is WAY
> too small!
>

Sure in fact I upped that to 4 minutes. Typically our nodes reboot in 3
minutes and will not update BIOS or OS automatically. Sometimes they become
"hosed" and slower (firmware bug throttling CPU speed for no reason) but in
that case it's better that Slurm recognizes it and deems the node down. But in any
case this is a moot point since the node is not going down


> Have you tried to experiment with the IPMI based power down/up method
> explained in the above 

Re: [slurm-users] enabling job script archival

2023-10-05 Thread Davide DelVento
Okay, so perhaps this is another bug. At each reconfigure, users lose
access to the jobs they submitted before the reconfigure itself and start
"clean slate". Newly submitted jobs can be queried normally. The slurm
administrator can query everything at all times, so the data is not
lost, but this is really unfortunate

Has anybody experienced this issue or can try querying some of their old
jobs which were completed before a reconfigure and confirm if this is
happening for them too?
Does anybody know whether this is already a known bug, and/or can anyone suggest whether I should report it?

Thanks!

On Wed, Oct 4, 2023 at 7:47 PM Davide DelVento 
wrote:

> And weirdly enough it has now stopped working again, after I did the
> experimentation for power save described in the other thread.
> That is really strange. At the highest verbosity level the logs just say
>
> slurmdbd: debug:  REQUEST_PERSIST_INIT: CLUSTER:cluster VERSION:9984
> UID:1457 IP:192.168.2.254 CONN:13
>
> I reconfigured and reverted stuff to no change. Does anybody have any clue?
>
> On Tue, Oct 3, 2023 at 5:43 PM Davide DelVento 
> wrote:
>
>> For others potentially seeing this on mailing list search, yes, I needed
>> that, which of course required creating an account charge which I wasn't
>> using. So I ran
>>
>> sacctmgr add account default_account
>> sacctmgr add -i user $user Accounts=default_account
>>
>> with an appropriate looping around for $user and everything is working
>> fine now.
>>
>> Thanks everybody!
>>
>> On Tue, Oct 3, 2023 at 7:44 AM Paul Edmon  wrote:
>>
>>> You will probably need to.
>>>
>>> The way we handle it is that we add users when the first submit a job
>>> via the job_submit.lua script. This way the database autopopulates with
>>> active users.
>>>
>>> -Paul Edmon-
>>> On 10/3/23 9:01 AM, Davide DelVento wrote:
>>>
>>> By increasing the slurmdbd verbosity level, I got additional
>>> information, namely the following:
>>>
>>> slurmdbd: error: couldn't get information for this user (null)(xx)
>>> slurmdbd: debug: accounting_storage/as_mysql:
>>> as_mysql_jobacct_process_get_jobs: User  xx  has no associations, and
>>> is not admin, so not returning any jobs.
>>>
>>> again where x is the posix ID of the user who's running the query in
>>> the slurmdbd logs.
>>>
>>> I suspect this is due to the fact that our userbase is small enough (we
>>> are a department HPC) that we don't need to use allocation and the like, so
>>> I have not configured any association (and not even studied its
>>> configuration, since when I was at another place which did use
>>> associations, someone else took care of slurm administration).
>>>
>>> Anyway, I read the fantastic document by our own member at
>>> https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#associations
>>> and in fact I have not even configured slurm users:
>>>
>>> # sacctmgr show user
>>>   User   Def Acct Admin
>>> -- -- -
>>>   root   root Administ+
>>> #
>>>
>>> So is that the issue? Should I just add all users? Any suggestions on
>>> the minimal (but robust) way to do that?
>>>
>>> Thanks!
>>>
>>>
>>> On Mon, Oct 2, 2023 at 9:20 AM Davide DelVento 
>>> wrote:
>>>
>>>> Thanks Paul, this helps.
>>>>
>>>> I don't have any PrivateData line in either config file. According to
>>>> the docs, "By default, all information is visible to all users" so this
>>>> should not be an issue. I tried to add a line with "PrivateData=jobs" to
>>>> the conf files, just in case, but that didn't change the behavior.
>>>>
>>>> On Mon, Oct 2, 2023 at 9:10 AM Paul Edmon 
>>>> wrote:
>>>>
>>>>> At least in our setup, users can see their own scripts by doing sacct
>>>>> -B -j JOBID
>>>>>
>>>>> I would make sure that the scripts are being stored and how you have
>>>>> PrivateData set.
>>>>>
>>>>> -Paul Edmon-
>>>>> On 10/2/2023 10:57 AM, Davide DelVento wrote:
>>>>>
>>>>> I deployed the job_script archival and it is working, however it can
>>>>> be queried only by root.
>>>>>
>>>>> A regular user can run sacct -lj towards any jobs (even those by other
>>>>> users, and that's okay in our set

Re: [slurm-users] enabling job script archival

2023-10-04 Thread Davide DelVento
And weirdly enough it has now stopped working again, after I did the
experimentation for power save described in the other thread.
That is really strange. At the highest verbosity level the logs just say

slurmdbd: debug:  REQUEST_PERSIST_INIT: CLUSTER:cluster VERSION:9984
UID:1457 IP:192.168.2.254 CONN:13

I reconfigured and reverted stuff to no change. Does anybody have any clue?

On Tue, Oct 3, 2023 at 5:43 PM Davide DelVento 
wrote:

> For others potentially seeing this on mailing list search, yes, I needed
> that, which of course required creating an account charge which I wasn't
> using. So I ran
>
> sacctmgr add account default_account
> sacctmgr add -i user $user Accounts=default_account
>
> with an appropriate looping around for $user and everything is working
> fine now.
>
> Thanks everybody!
>
> On Tue, Oct 3, 2023 at 7:44 AM Paul Edmon  wrote:
>
>> You will probably need to.
>>
>> The way we handle it is that we add users when the first submit a job via
>> the job_submit.lua script. This way the database autopopulates with active
>> users.
>>
>> -Paul Edmon-
>> On 10/3/23 9:01 AM, Davide DelVento wrote:
>>
>> By increasing the slurmdbd verbosity level, I got additional information,
>> namely the following:
>>
>> slurmdbd: error: couldn't get information for this user (null)(xx)
>> slurmdbd: debug: accounting_storage/as_mysql:
>> as_mysql_jobacct_process_get_jobs: User  xx  has no associations, and
>> is not admin, so not returning any jobs.
>>
>> again where x is the posix ID of the user who's running the query in
>> the slurmdbd logs.
>>
>> I suspect this is due to the fact that our userbase is small enough (we
>> are a department HPC) that we don't need to use allocation and the like, so
>> I have not configured any association (and not even studied its
>> configuration, since when I was at another place which did use
>> associations, someone else took care of slurm administration).
>>
>> Anyway, I read the fantastic document by our own member at
>> https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#associations
>> and in fact I have not even configured slurm users:
>>
>> # sacctmgr show user
>>   User   Def Acct Admin
>> -- -- -
>>   root   root Administ+
>> #
>>
>> So is that the issue? Should I just add all users? Any suggestions on the
>> minimal (but robust) way to do that?
>>
>> Thanks!
>>
>>
>> On Mon, Oct 2, 2023 at 9:20 AM Davide DelVento 
>> wrote:
>>
>>> Thanks Paul, this helps.
>>>
>>> I don't have any PrivateData line in either config file. According to
>>> the docs, "By default, all information is visible to all users" so this
>>> should not be an issue. I tried to add a line with "PrivateData=jobs" to
>>> the conf files, just in case, but that didn't change the behavior.
>>>
>>> On Mon, Oct 2, 2023 at 9:10 AM Paul Edmon 
>>> wrote:
>>>
>>>> At least in our setup, users can see their own scripts by doing sacct
>>>> -B -j JOBID
>>>>
>>>> I would make sure that the scripts are being stored and how you have
>>>> PrivateData set.
>>>>
>>>> -Paul Edmon-
>>>> On 10/2/2023 10:57 AM, Davide DelVento wrote:
>>>>
>>>> I deployed the job_script archival and it is working, however it can be
>>>> queried only by root.
>>>>
>>>> A regular user can run sacct -lj towards any jobs (even those by other
>>>> users, and that's okay in our setup) with no problem. However if they run
>>>> sacct -j job_id --batch-script even against a job they own themselves,
>>>> nothing is returned and I get a
>>>>
>>>> slurmdbd: error: couldn't get information for this user (null)(xx)
>>>>
>>>> where x is the posix ID of the user who's running the query in the
>>>> slurmdbd logs.
>>>>
>>>> Both configure files slurmdbd.conf and slurm.conf do not have any
>>>> "permission" setting. FWIW, we use LDAP.
>>>>
>>>> Is that the expected behavior, in that by default only root can see the
>>>> job scripts? I was assuming the users themselves should be able to debug
>>>> their own jobs... Any hint on what could be changed to achieve this?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>> On Fri, Sep 29, 2023 at 5:48 AM Davide DelVento <
&

[slurm-users] Slurm powersave

2023-10-04 Thread Davide DelVento
I'm experimenting with slurm powersave and I have several questions. I'm
following the guidance from https://slurm.schedmd.com/power_save.html and
the great presentation from our own
https://slurm.schedmd.com/SLUG23/DTU-SLUG23.pdf

I am running slurm 23.02.3

1) I'm not sure I fully understand ReconfigFlags=KeepPowerSaveSettings
The documentation says that if set, an "scontrol reconfig" command will
preserve the current state of SuspendExcNodes, SuspendExcParts and
SuspendExcStates. Why would one *NOT* want to preserve that? What would
happen if one does not (or does) have this setting? For now I'm using it,
assuming that it means "if I run scontrol reconfig" don't shut off nodes
that are up because I said so that they should be up in slurm.conf with
those three options" --- but I am not clear if that is really what it says.

2) the PDF above says that the problem with nodes in down and drained state
is solved in 23.02 but that does not appear to be the case. Before running
my experiment, I had

$ sinfo -R
REASON   USER  TIMESTAMP   NODELIST
Not responding   root  2023-09-13T13:14:50 node31
ECC memory errorsroot  2023-08-26T07:21:04 node27

and after it became

$ sinfo -R
REASON   USER  TIMESTAMP   NODELIST
Not responding   root  2023-09-13T13:14:50 node31
none Unknown   Unknown node27

And that despite having excluded drain'ed nodes as below:

--- a/slurm/slurm.conf
+++ b/slurm/slurm.conf
@@ -140,12 +140,15 @@ SlurmdLogFile=/var/log/slurm/slurmd.log
 #
 #
 # POWER SAVE SUPPORT FOR IDLE NODES (optional)
+SuspendProgram=/opt/slurm/poweroff
+ResumeProgram=/opt/slurm/poweron
+SuspendTimeout=120
+ResumeTimeout=240
 #ResumeRate=
+SuspendExcNodes=node[13-32]:2
+SuspendExcStates=down,drain,fail,maint,not_responding,reserved
+BatchStartTimeout=60
+ReconfigFlags=KeepPowerSaveSettings # not sure if needed: preserve current status when running "scontrol reconfig"
-PartitionName=compute512 Default=False Nodes=node[13-32] State=UP DefMemPerCPU=9196
+PartitionName=compute512 Default=False Nodes=node[13-32] State=UP DefMemPerCPU=9196 SuspendTime=600

so probably that's not solved? Anyway, that's a nuisance, not a deal breaker

3) The whole thing does not appear to be working as I intended. My
understanding of the "exclude nodes" setting above was that slurm should
never attempt to shut off more than the number of idle nodes in that
partition minus 2. Instead it shut all of them off, and then tried to turn
them back on:

$ sinfo | grep 512
compute512 up   infinite  1 alloc# node15
compute512 up   infinite  2  idle# node[14,32]
compute512 up   infinite  3  down~ node[16-17,31]
compute512 up   infinite  1 drain~ node27
compute512 up   infinite 12  idle~ node[18-26,28-30]
compute512 up   infinite  1  alloc node13

But again this is a minor nuisance which I can live with (especially if it
happens only when I "flip the switch"), and I'm mentioning only in case
it's a symptom of something else I'm doing wrong. I did try to use both the
SuspendExcNodes=node[13-32]:2 syntax, as it seems more consistent to me with
the rest of the file (e.g. the partition definitions), and the
SuspendExcNodes=node[13\-32]:2 syntax suggested in the slurm powersave
documentation. The behavior was exactly identical.

4) Most importantly from the output above you may have noticed two nodes
(actually three by the time I ran the command below) that slurm deemed down

$ sinfo -R
REASON   USER  TIMESTAMP   NODELIST
Not responding   root  2023-09-13T13:14:50 node31
reboot timed out slurm 2023-10-04T14:51:28 node14
reboot timed out slurm 2023-10-04T14:52:28 node15
reboot timed out slurm 2023-10-04T14:49:58 node32
none Unknown   Unknown node27

This can't be the case: the nodes are fine and cannot have timed out while
"rebooting", because for now my poweroff and poweron scripts are identical,
literally a simple one-liner bash script doing almost nothing, and the
log file is populated correctly, as I would expect:

echo "Pretending to $0 the following node(s): $1"  >> $log_file 2>&1

So I can confirm slurm invoked the script, but then waited for something
(what? starting slurmd?) which failed to occur and marked the node down.
When I removed the suspend time from the partition to end the experiment,
the other nodes went "magically" back into production, without slurm calling
my poweron script. Of course the nodes were never powered off, but slurm
thought they were, so why did it not have the problem it did with the nodes
which it actually tried to power back on?

Thanks for any light you can shed on these issues, particularly the last
one!


Re: [slurm-users] enabling job script archival

2023-10-03 Thread Davide DelVento
For others potentially seeing this on mailing list search, yes, I needed
that, which of course required creating an account charge which I wasn't
using. So I ran

sacctmgr add account default_account
sacctmgr add -i user $user Accounts=default_account

with an appropriate looping around for $user and everything is working fine
now.
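
In case it helps others, the loop can be something as simple as the sketch
below; how you enumerate users (here: passwd entries with UID >= 1000, which
with LDAP means whatever getent returns) is an assumption to adapt per site:

sacctmgr -i add account default_account
for user in $(getent passwd | awk -F: '$3 >= 1000 {print $1}'); do
    sacctmgr add -i user "$user" Accounts=default_account
done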

Thanks everybody!

On Tue, Oct 3, 2023 at 7:44 AM Paul Edmon  wrote:

> You will probably need to.
>
> The way we handle it is that we add users when the first submit a job via
> the job_submit.lua script. This way the database autopopulates with active
> users.
>
> -Paul Edmon-
> On 10/3/23 9:01 AM, Davide DelVento wrote:
>
> By increasing the slurmdbd verbosity level, I got additional information,
> namely the following:
>
> slurmdbd: error: couldn't get information for this user (null)(xx)
> slurmdbd: debug: accounting_storage/as_mysql:
> as_mysql_jobacct_process_get_jobs: User  xx  has no associations, and
> is not admin, so not returning any jobs.
>
> again where x is the posix ID of the user who's running the query in
> the slurmdbd logs.
>
> I suspect this is due to the fact that our userbase is small enough (we
> are a department HPC) that we don't need to use allocation and the like, so
> I have not configured any association (and not even studied its
> configuration, since when I was at another place which did use
> associations, someone else took care of slurm administration).
>
> Anyway, I read the fantastic document by our own member at
> https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#associations
> and in fact I have not even configured slurm users:
>
> # sacctmgr show user
>   User   Def Acct Admin
> -- -- -
>   root   root Administ+
> #
>
> So is that the issue? Should I just add all users? Any suggestions on the
> minimal (but robust) way to do that?
>
> Thanks!
>
>
> On Mon, Oct 2, 2023 at 9:20 AM Davide DelVento 
> wrote:
>
>> Thanks Paul, this helps.
>>
>> I don't have any PrivateData line in either config file. According to the
>> docs, "By default, all information is visible to all users" so this should
>> not be an issue. I tried to add a line with "PrivateData=jobs" to the conf
>> files, just in case, but that didn't change the behavior.
>>
>> On Mon, Oct 2, 2023 at 9:10 AM Paul Edmon  wrote:
>>
>>> At least in our setup, users can see their own scripts by doing sacct -B
>>> -j JOBID
>>>
>>> I would make sure that the scripts are being stored and how you have
>>> PrivateData set.
>>>
>>> -Paul Edmon-
>>> On 10/2/2023 10:57 AM, Davide DelVento wrote:
>>>
>>> I deployed the job_script archival and it is working, however it can be
>>> queried only by root.
>>>
>>> A regular user can run sacct -lj towards any jobs (even those by other
>>> users, and that's okay in our setup) with no problem. However if they run
>>> sacct -j job_id --batch-script even against a job they own themselves,
>>> nothing is returned and I get a
>>>
>>> slurmdbd: error: couldn't get information for this user (null)(xx)
>>>
>>> where x is the posix ID of the user who's running the query in the
>>> slurmdbd logs.
>>>
>>> Both configure files slurmdbd.conf and slurm.conf do not have any
>>> "permission" setting. FWIW, we use LDAP.
>>>
>>> Is that the expected behavior, in that by default only root can see the
>>> job scripts? I was assuming the users themselves should be able to debug
>>> their own jobs... Any hint on what could be changed to achieve this?
>>>
>>> Thanks!
>>>
>>>
>>>
>>> On Fri, Sep 29, 2023 at 5:48 AM Davide DelVento <
>>> davide.quan...@gmail.com> wrote:
>>>
>>>> Fantastic, this is really helpful, thanks!
>>>>
>>>> On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon 
>>>> wrote:
>>>>
>>>>> Yes it was later than that. If you are 23.02 you are good.  We've been
>>>>> running with storing job_scripts on for years at this point and that part
>>>>> of the database only uses up 8.4G.  Our entire database takes up 29G on
>>>>> disk. So its about 1/3 of the database.  We also have database compression
>>>>> which helps with the on disk size. Raw uncompressed our database is about
>>>>> 90G.  We keep 6 months of data in our active database.
>>>>>
>>>>> -Paul Edmon-
>>>>>

Re: [slurm-users] enabling job script archival

2023-10-03 Thread Davide DelVento
By increasing the slurmdbd verbosity level, I got additional information,
namely the following:

slurmdbd: error: couldn't get information for this user (null)(xx)
slurmdbd: debug: accounting_storage/as_mysql:
as_mysql_jobacct_process_get_jobs: User  xx  has no associations, and
is not admin, so not returning any jobs.

again where x is the posix ID of the user who's running the query in
the slurmdbd logs.

I suspect this is due to the fact that our userbase is small enough (we are
a department HPC) that we don't need to use allocation and the like, so I
have not configured any association (and not even studied its
configuration, since when I was at another place which did use
associations, someone else took care of slurm administration).

Anyway, I read the fantastic document by our own member at
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#associations
and in fact I have not even configured slurm users:

# sacctmgr show user
  User   Def Acct Admin
-- -- -
  root   root Administ+
#

So is that the issue? Should I just add all users? Any suggestions on the
minimal (but robust) way to do that?

Thanks!


On Mon, Oct 2, 2023 at 9:20 AM Davide DelVento 
wrote:

> Thanks Paul, this helps.
>
> I don't have any PrivateData line in either config file. According to the
> docs, "By default, all information is visible to all users" so this should
> not be an issue. I tried to add a line with "PrivateData=jobs" to the conf
> files, just in case, but that didn't change the behavior.
>
> On Mon, Oct 2, 2023 at 9:10 AM Paul Edmon  wrote:
>
>> At least in our setup, users can see their own scripts by doing sacct -B
>> -j JOBID
>>
>> I would make sure that the scripts are being stored and how you have
>> PrivateData set.
>>
>> -Paul Edmon-
>> On 10/2/2023 10:57 AM, Davide DelVento wrote:
>>
>> I deployed the job_script archival and it is working, however it can be
>> queried only by root.
>>
>> A regular user can run sacct -lj towards any jobs (even those by other
>> users, and that's okay in our setup) with no problem. However if they run
>> sacct -j job_id --batch-script even against a job they own themselves,
>> nothing is returned and I get a
>>
>> slurmdbd: error: couldn't get information for this user (null)(xx)
>>
>> where x is the posix ID of the user who's running the query in the
>> slurmdbd logs.
>>
>> Both configure files slurmdbd.conf and slurm.conf do not have any
>> "permission" setting. FWIW, we use LDAP.
>>
>> Is that the expected behavior, in that by default only root can see the
>> job scripts? I was assuming the users themselves should be able to debug
>> their own jobs... Any hint on what could be changed to achieve this?
>>
>> Thanks!
>>
>>
>>
>> On Fri, Sep 29, 2023 at 5:48 AM Davide DelVento 
>> wrote:
>>
>>> Fantastic, this is really helpful, thanks!
>>>
>>> On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon 
>>> wrote:
>>>
>>>> Yes it was later than that. If you are 23.02 you are good.  We've been
>>>> running with storing job_scripts on for years at this point and that part
>>>> of the database only uses up 8.4G.  Our entire database takes up 29G on
>>>> disk. So its about 1/3 of the database.  We also have database compression
>>>> which helps with the on disk size. Raw uncompressed our database is about
>>>> 90G.  We keep 6 months of data in our active database.
>>>>
>>>> -Paul Edmon-
>>>> On 9/28/2023 1:57 PM, Ryan Novosielski wrote:
>>>>
>>>> Sorry for the duplicate e-mail in a short time: do you know (or anyone)
>>>> when the hashing was added? Was planning to enable this on 21.08, but we
>>>> then had to delay our upgrade to it. I’m assuming later than that, as I
>>>> believe that’s when the feature was added.
>>>>
>>>> On Sep 28, 2023, at 13:55, Ryan Novosielski 
>>>>  wrote:
>>>>
>>>> Thank you; we’ll put in a feature request for improvements in that
>>>> area, and also thanks for the warning? I thought of that in passing, but
>>>> the real world experience is really useful. I could easily see wanting that
>>>> stuff to be retained less often than the main records, which is what I’d
>>>> ask for.
>>>>
>>>> I assume that archiving, in general, would also remove this stuff,
>>>> since old jobs themselves will be removed?
>>>>
>>>> --
>>>> #BlackLivesMatter
>

Re: [slurm-users] enabling job script archival

2023-10-02 Thread Davide DelVento
Thanks Paul, this helps.

I don't have any PrivateData line in either config file. According to the
docs, "By default, all information is visible to all users" so this should
not be an issue. I tried to add a line with "PrivateData=jobs" to the conf
files, just in case, but that didn't change the behavior.

On Mon, Oct 2, 2023 at 9:10 AM Paul Edmon  wrote:

> At least in our setup, users can see their own scripts by doing sacct -B
> -j JOBID
>
> I would make sure that the scripts are being stored and how you have
> PrivateData set.
>
> -Paul Edmon-
> On 10/2/2023 10:57 AM, Davide DelVento wrote:
>
> I deployed the job_script archival and it is working, however it can be
> queried only by root.
>
> A regular user can run sacct -lj towards any jobs (even those by other
> users, and that's okay in our setup) with no problem. However if they run
> sacct -j job_id --batch-script even against a job they own themselves,
> nothing is returned and I get a
>
> slurmdbd: error: couldn't get information for this user (null)(xx)
>
> where x is the posix ID of the user who's running the query in the
> slurmdbd logs.
>
> Both configure files slurmdbd.conf and slurm.conf do not have any
> "permission" setting. FWIW, we use LDAP.
>
> Is that the expected behavior, in that by default only root can see the
> job scripts? I was assuming the users themselves should be able to debug
> their own jobs... Any hint on what could be changed to achieve this?
>
> Thanks!
>
>
>
> On Fri, Sep 29, 2023 at 5:48 AM Davide DelVento 
> wrote:
>
>> Fantastic, this is really helpful, thanks!
>>
>> On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon 
>> wrote:
>>
>>> Yes it was later than that. If you are 23.02 you are good.  We've been
>>> running with storing job_scripts on for years at this point and that part
>>> of the database only uses up 8.4G.  Our entire database takes up 29G on
>>> disk. So its about 1/3 of the database.  We also have database compression
>>> which helps with the on disk size. Raw uncompressed our database is about
>>> 90G.  We keep 6 months of data in our active database.
>>>
>>> -Paul Edmon-
>>> On 9/28/2023 1:57 PM, Ryan Novosielski wrote:
>>>
>>> Sorry for the duplicate e-mail in a short time: do you know (or anyone)
>>> when the hashing was added? Was planning to enable this on 21.08, but we
>>> then had to delay our upgrade to it. I’m assuming later than that, as I
>>> believe that’s when the feature was added.
>>>
>>> On Sep 28, 2023, at 13:55, Ryan Novosielski 
>>>  wrote:
>>>
>>> Thank you; we’ll put in a feature request for improvements in that area,
>>> and also thanks for the warning? I thought of that in passing, but the real
>>> world experience is really useful. I could easily see wanting that stuff to
>>> be retained less often than the main records, which is what I’d ask for.
>>>
>>> I assume that archiving, in general, would also remove this stuff, since
>>> old jobs themselves will be removed?
>>>
>>> --
>>> #BlackLivesMatter
>>> 
>>> || \\UTGERS,
>>> |---*O*---
>>> ||_// the State  | Ryan Novosielski - novos...@rutgers.edu
>>> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~
>>> RBHS Campus
>>> ||  \\of NJ  | Office of Advanced Research Computing - MSB
>>> A555B, Newark
>>>  `'
>>>
>>> On Sep 28, 2023, at 13:48, Paul Edmon 
>>>  wrote:
>>>
>>> Slurm should take care of it when you add it.
>>>
>>> So far as horror stories, under previous versions our database size
>>> ballooned to be so massive that it actually prevented us from upgrading and
>>> we had to drop the columns containing the job_script and job_env.  This was
>>> back before slurm started hashing the scripts so that it would only store
>>> one copy of duplicate scripts.  After this point we found that the
>>> job_script database stayed at a fairly reasonable size as most users use
>>> functionally the same script each time. However the job_env continued to
>>> grow like crazy as there are variables in our environment that change
>>> fairly consistently depending on where the user is. Thus job_envs ended up
>>> being too massive to keep around and so we had to drop them. Frankly we
>>> never really used them for debugging. The job_scripts though are super
>>> useful and not that much overhead.

Re: [slurm-users] enabling job script archival

2023-10-02 Thread Davide DelVento
I deployed the job_script archival and it is working, however it can be
queried only by root.

A regular user can run sacct -lj towards any jobs (even those by other
users, and that's okay in our setup) with no problem. However if they run
sacct -j job_id --batch-script even against a job they own themselves,
nothing is returned and I get a

slurmdbd: error: couldn't get information for this user (null)(xx)

where x is the posix ID of the user who's running the query in the
slurmdbd logs.

Both configure files slurmdbd.conf and slurm.conf do not have any
"permission" setting. FWIW, we use LDAP.

Is that the expected behavior, in that by default only root can see the job
scripts? I was assuming the users themselves should be able to debug their
own jobs... Any hint on what could be changed to achieve this?

Thanks!



On Fri, Sep 29, 2023 at 5:48 AM Davide DelVento 
wrote:

> Fantastic, this is really helpful, thanks!
>
> On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon 
> wrote:
>
>> Yes it was later than that. If you are 23.02 you are good.  We've been
>> running with storing job_scripts on for years at this point and that part
>> of the database only uses up 8.4G.  Our entire database takes up 29G on
>> disk. So its about 1/3 of the database.  We also have database compression
>> which helps with the on disk size. Raw uncompressed our database is about
>> 90G.  We keep 6 months of data in our active database.
>>
>> -Paul Edmon-
>> On 9/28/2023 1:57 PM, Ryan Novosielski wrote:
>>
>> Sorry for the duplicate e-mail in a short time: do you know (or anyone)
>> when the hashing was added? Was planning to enable this on 21.08, but we
>> then had to delay our upgrade to it. I’m assuming later than that, as I
>> believe that’s when the feature was added.
>>
>> On Sep 28, 2023, at 13:55, Ryan Novosielski 
>>  wrote:
>>
>> Thank you; we’ll put in a feature request for improvements in that area,
>> and also thanks for the warning? I thought of that in passing, but the real
>> world experience is really useful. I could easily see wanting that stuff to
>> be retained less often than the main records, which is what I’d ask for.
>>
>> I assume that archiving, in general, would also remove this stuff, since
>> old jobs themselves will be removed?
>>
>> --
>> #BlackLivesMatter
>> 
>> || \\UTGERS,
>> |---*O*---
>> ||_// the State  | Ryan Novosielski - novos...@rutgers.edu
>> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~
>> RBHS Campus
>> ||  \\of NJ  | Office of Advanced Research Computing - MSB
>> A555B, Newark
>>  `'
>>
>> On Sep 28, 2023, at 13:48, Paul Edmon 
>>  wrote:
>>
>> Slurm should take care of it when you add it.
>>
>> So far as horror stories, under previous versions our database size
>> ballooned to be so massive that it actually prevented us from upgrading and
>> we had to drop the columns containing the job_script and job_env.  This was
>> back before slurm started hashing the scripts so that it would only store
>> one copy of duplicate scripts.  After this point we found that the
>> job_script database stayed at a fairly reasonable size as most users use
>> functionally the same script each time. However the job_env continued to
>> grow like crazy as there are variables in our environment that change
>> fairly consistently depending on where the user is. Thus job_envs ended up
>> being too massive to keep around and so we had to drop them. Frankly we
>> never really used them for debugging. The job_scripts though are super
>> useful and not that much overhead.
>>
>> In summary my recommendation is to only store job_scripts. job_envs add
>> too much storage for little gain, unless your job_envs are basically the
>> same for each user in each location.
>>
>> Also it should be noted that there is no way to prune out job_scripts or
>> job_envs right now. So the only way to get rid of them if they get large is
>> to 0 out the column in the table. You can ask SchedMD for the mysql command
>> to do this as we had to do it here to our job_envs.
>>
>> -Paul Edmon-
>>
>> On 9/28/2023 1:40 PM, Davide DelVento wrote:
>>
>> In my current slurm installation, (recently upgraded to slurm v23.02.3),
>> I only have
>>
>> AccountingStoreFlags=job_comment
>>
>> I now intend to add both
>>
>> AccountingStoreFlags=job_script
>> AccountingStoreFlags=job_env
>>
>> leaving the default 4MB value for max_script_size
>>
>> Do I need to do anything on the DB myself, or will slurm take care of the
>> additional tables if needed?
>>
>> Any comments/suggestions/gotcha/pitfalls/horror_stories to share? I know
>> about the additional diskspace and potentially load needed, and with our
>> resources and typical workload I should be okay with that.
>>
>> Thanks!
>>
>>
>>
>>
>>


Re: [slurm-users] Verifying preemption WON'T happen

2023-09-29 Thread Davide DelVento
I don't really have an answer for you other than a "hallway comment", that
it sounds like a good thing which I would test with a simulator, if I had
one. I've been intrigued by (but really not looked much into)
https://slurm.schedmd.com/SLUG23/LANL-Batsim-SLUG23.pdf

On Fri, Sep 29, 2023 at 10:05 AM Groner, Rob  wrote:

> On our system, for some partitions, we guarantee that a job can run at
> least an hour before being preempted by a higher priority job.  We use the
> QOS preempt exempt time for this, and it appears to be working.  But of
> course, I want to TEST that it works.
>
> So on a test system, I start a lower priority job on a specific node, wait
> until it starts running, and then I start a higher priority job for the
> same node.  The test should only pass if the higher priority job has an
> OPPORTUNITY to preempt the lower priority job, and doesn't.
>
> Now, I know I can get a preempt eligible time out of scontrol for the
> lower priority job and verify that it's set for an hour (I do check that
> already), but that's not good enough for me.  I could obviously let the
> test run for an hour to verify the lower priority job was never
> preempted...but that's not really feasible.  So instead, I want to verify
> that the higher priority job has had a chance to preempt the lower priority
> job, and it did not.
>
> So far, the way I've been doing that is to check the reported Scheduler in
> the scontrol job output for the higher priority job.  I figure that when
> the scheduler changes to Backfill instead of Main, then the higher priority
> job has been seen by the main scheduler and it passed on the chance to
> preempt the lower priority job.
>
> Is that a good assumption?  Is there any other, or potentially quicker,
> way to verify that the higher priority job will NOT preempt the lower
> priority job?
>
> Rob
>
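
A rough sketch of that check as a script, in case it is useful (job IDs are
placeholders; it simply polls the Scheduler field mentioned above and then
confirms the lower priority job is still running):

#!/usr/bin/bash
HIGH=12346   # higher priority job
LOW=12345    # lower priority job that must not be preempted
# wait until the main scheduler has passed over the high priority job
until scontrol show job "$HIGH" | grep -q 'Scheduler=Backfill'; do
    sleep 5
done
# at this point the lower priority job should still be running
squeue -h -j "$LOW" -o '%T' | grep -q RUNNING && echo "PASS: not preempted"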


Re: [slurm-users] enabling job script archival

2023-09-29 Thread Davide DelVento
Fantastic, this is really helpful, thanks!

On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon  wrote:

> Yes it was later than that. If you are 23.02 you are good.  We've been
> running with storing job_scripts on for years at this point and that part
> of the database only uses up 8.4G.  Our entire database takes up 29G on
> disk. So its about 1/3 of the database.  We also have database compression
> which helps with the on disk size. Raw uncompressed our database is about
> 90G.  We keep 6 months of data in our active database.
>
> -Paul Edmon-
> On 9/28/2023 1:57 PM, Ryan Novosielski wrote:
>
> Sorry for the duplicate e-mail in a short time: do you know (or anyone)
> when the hashing was added? Was planning to enable this on 21.08, but we
> then had to delay our upgrade to it. I’m assuming later than that, as I
> believe that’s when the feature was added.
>
> On Sep 28, 2023, at 13:55, Ryan Novosielski 
>  wrote:
>
> Thank you; we’ll put in a feature request for improvements in that area,
> and also thanks for the warning? I thought of that in passing, but the real
> world experience is really useful. I could easily see wanting that stuff to
> be retained less often than the main records, which is what I’d ask for.
>
> I assume that archiving, in general, would also remove this stuff, since
> old jobs themselves will be removed?
>
> --
> #BlackLivesMatter
> 
> || \\UTGERS, |---*O*---
> ||_// the State  | Ryan Novosielski - novos...@rutgers.edu
> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> ||  \\of NJ  | Office of Advanced Research Computing - MSB
> A555B, Newark
>  `'
>
> On Sep 28, 2023, at 13:48, Paul Edmon 
>  wrote:
>
> Slurm should take care of it when you add it.
>
> So far as horror stories, under previous versions our database size
> ballooned to be so massive that it actually prevented us from upgrading and
> we had to drop the columns containing the job_script and job_env.  This was
> back before slurm started hashing the scripts so that it would only store
> one copy of duplicate scripts.  After this point we found that the
> job_script database stayed at a fairly reasonable size as most users use
> functionally the same script each time. However the job_env continued to
> grow like crazy as there are variables in our environment that change
> fairly consistently depending on where the user is. Thus job_envs ended up
> being too massive to keep around and so we had to drop them. Frankly we
> never really used them for debugging. The job_scripts though are super
> useful and not that much overhead.
>
> In summary my recommendation is to only store job_scripts. job_envs add
> too much storage for little gain, unless your job_envs are basically the
> same for each user in each location.
>
> Also it should be noted that there is no way to prune out job_scripts or
> job_envs right now. So the only way to get rid of them if they get large is
> to 0 out the column in the table. You can ask SchedMD for the mysql command
> to do this as we had to do it here to our job_envs.
>
> -Paul Edmon-
>
> On 9/28/2023 1:40 PM, Davide DelVento wrote:
>
> In my current slurm installation, (recently upgraded to slurm v23.02.3), I
> only have
>
> AccountingStoreFlags=job_comment
>
> I now intend to add both
>
> AccountingStoreFlags=job_script
> AccountingStoreFlags=job_env
>
> leaving the default 4MB value for max_script_size
>
> Do I need to do anything on the DB myself, or will slurm take care of the
> additional tables if needed?
>
> Any comments/suggestions/gotcha/pitfalls/horror_stories to share? I know
> about the additional diskspace and potentially load needed, and with our
> resources and typical workload I should be okay with that.
>
> Thanks!
>
>
>
>
>


[slurm-users] enabling job script archival

2023-09-28 Thread Davide DelVento
In my current slurm installation, (recently upgraded to slurm v23.02.3), I
only have

AccountingStoreFlags=job_comment

I now intend to add both

AccountingStoreFlags=job_script
AccountingStoreFlags=job_env

leaving the default 4MB value for max_script_size

Do I need to do anything on the DB myself, or will slurm take care of the
additional tables if needed?

Any comments/suggestions/gotcha/pitfalls/horror_stories to share? I know
about the additional diskspace and potentially load needed, and with our
resources and typical workload I should be okay with that.

Thanks!
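
(Side note for anyone finding this later: once the flags are enabled, the
stored data can be pulled back per job with sacct, as mentioned elsewhere in
this thread -- assuming a recent enough sacct that has these options:)

sacct -j 12345 --batch-script   # show the stored job script
sacct -j 12345 --env-vars       # show the stored job environment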


[slurm-users] slurmrestd memory leak

2023-08-22 Thread Davide DelVento
Has anyone else noticed this issue and knows more about it?

https://bugs.schedmd.com/show_bug.cgi?id=16976

Mitigating it by preventing users from submitting many jobs works, but only to a
point.
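
(For reference, the kind of limit I mean is a per-user submit cap on a QOS,
along these lines -- the QOS name and the value are placeholders:)

sacctmgr -i modify qos normal set MaxSubmitJobsPerUser=500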


Re: [slurm-users] Tracking efficiency of all jobs on the cluster (dashboard etc.)

2023-07-24 Thread Davide DelVento
I run a cluster we bought from ACT and recently updated to ClusterVisor v1.0

The new version has (among many things) a really nice view of individual
jobs resource utilization (GPUs, memory, CPU, temperature, etc). I did not
pay attention to the overall statistics, so I am not sure how CV fares
there -- because I care only about individual jobs (I work with individual
users, and don't care about overall utilization, which is info for the
upper management). At the moment only admins can see the info, but my
understanding is that they are considering making it a user-space feature,
which will be really slick.

Several years ago I used XDMOD and Supremm and it was more confusing to use
and had trouble collecting all the data we needed (which the team blamed
on some BIOS settings), so the view was incomplete. Also, the tool seemed
to be more focused on the overall stats rather than per job info (both were
available, but the focus seemed on the former). I am sure these tools have
improved since then, so I'm not dismissing them, just giving my opinion
based on old facts. Comparing that old version of XDMOD to current CV
(unfair, I know, but that's the comparison I've got) the latter wins hands
down for per-job information. Also probably unfair is that XDMOD and
Supremm are free and open source whereas CV is proprietary.
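
For a quick-and-dirty survey without any dashboard, something along these lines can already flag the worst offenders (a sketch, assuming the seff contrib script is installed and you can read everyone's accounting data):

# CPU/memory efficiency of every completed job since a given date
for jobid in $(sacct -a -n -X -S 2023-07-01 --state=COMPLETED -o jobid | tr -d ' '); do
    printf '%s: ' "$jobid"
    seff "$jobid" | grep -E 'CPU Efficiency|Memory Efficiency' | tr '\n' ' '
    echo
done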


On Mon, Jul 24, 2023 at 2:57 PM Magnus Jonsson 
wrote:

> We are feeding job usage information into a Prometheus database for our
> users (and us) to look at (via Grafana).
>
> It is also possible to get a list of jobs that are underusing memory, GPU
> or whatever metric you feed into the database.
>
>
>
> It’s a live feed with ~30s resolution from both compute jobs and Lustre
> file system.
>
> It’s easy to extend with more metrics.
>
>
>
> If you want more information on what we are doing just send me an email
> and I can give you more information.
>
>
>
> /Magnus
>
>
>
> --
>
> Magnus Jonsson, Developer, HPC2N, Umeå Universitet
>
> By sending an email to Umeå University, the University will need to
>
> process your personal data. For more information, please read
> www.umu.se/en/gdpr
>
> *From:* slurm-users  *On behalf of *Will
> Furnell - STFC UKRI
> *Sent:* Monday, 24 July 2023 16:38
> *To:* slurm-us...@schedmd.com
> *Subject:* [slurm-users] Tracking efficiency of all jobs on the cluster
> (dashboard etc.)
>
>
>
> Hello,
>
>
>
> I am aware of ‘seff’, which allows you to check the efficiency of a single
> job, which is good for users, but as a cluster administrator I would like
> to be able to track the efficiency of all jobs from all users on the
> cluster, so I am able to ‘re-educate’ users that may be running jobs that
> have terrible resource usage efficiency.
>
>
>
> What do other cluster administrators use for this task? Is there anything
> you use and recommend (or don’t recommend) or have heard of that is able to
> do this? Even if it’s something like a Grafana dashboard that hooks up to
> the SLURM database,
>
>
>
> Thank you,
>
>
>
> Will.
>


Re: [slurm-users] Decreasing time limit of running jobs (notification)

2023-07-10 Thread Davide DelVento
Actually rm -r does not give ANY warning, so in plain Linux "rm -r /" run
as root would destroy your system without notice. Your particular Linux
distro may have implemented safeguards with a shell alias such as `alias
rm='rm -i'`; that's a common thing, but it's not guaranteed to be there.
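
If someone really wanted the confirmation safeguard Jason mentions below, a small wrapper around scontrol is easy to sketch (hypothetical and untested, names invented):

#!/bin/bash
# confirm-timelimit.sh JOBID NEW_LIMIT
jobid=$1; newlimit=$2
elapsed=$(squeue -h -j "$jobid" -o %M)   # time the job has already run
current=$(squeue -h -j "$jobid" -o %l)   # current TimeLimit
echo "Job $jobid has already run $elapsed (current limit: $current)."
read -r -p "Really set TimeLimit=$newlimit ? [y/N] " answer
[[ $answer == [yY] ]] && scontrol update JobId="$jobid" TimeLimit="$newlimit"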

On Thu, Jul 6, 2023 at 11:40 AM Amjad Syed  wrote:

> Agreed on the point of greater responsibility, but even rm -r (without
> -f) gives a warning. In this case should slurm have that option
> (forced), especially if it can immediately kill a running job?
>
>
>
>
>
> On Thu, 6 Jul 2023, 18:16 Jason Simms,  wrote:
>
>> An unfortunate example of the “with great power comes great
>> responsibility” maxim. Linux will gleefully let you rm -fr your entire
>> system, drop production databases, etc., provided you have the right
>> privileges. Ask me how I know…
>>
>> Still, I get the point. Would it be possible to somehow ask for
>> confirmation if you are setting a max time that is less than the current
>> walltime? Perhaps. Could you script that yourself? Yes, I’m certain of it.
>> Those kind of built-in safeguards aren’t super common, however.
>>
>> Jason
>>
>> On Thu, Jul 6, 2023 at 12:55 PM Amjad Syed  wrote:
>>
>>> Yes, the initial End Time was 7-00:00:00 but it allowed the typo
>>> (16:00:00) which caused the jobs to be killed without warning
>>>
>>> Amjad
>>>
>>> On Thu, Jul 6, 2023 at 5:27 PM Bernstein, Noam CIV USN NRL (6393)
>>> Washington DC (USA)  wrote:
>>>
 Is the issue that the error in the time made it shorter than the time
 the job had already run, so it killed it immediately?

 On Jul 6, 2023, at 12:04 PM, Jason Simms 
 wrote:

 No, not a bug, I would say. When the time limit is reached, that's it,
 job dies. I wouldn't be aware of a way to manage that. Once the time limit
 is reached, it wouldn't be a hard limit if you then had to notify the user
 and then... what? How long would you give them to extend the time? Wouldn't
 be much of a limit if a job can be extended, plus that would throw off the
 scheduler/estimator. I'd chalk it up to an unfortunate typo.

 Jason

 On Thu, Jul 6, 2023 at 11:54 AM Amjad Syed  wrote:

> Hello
>
> We were trying to increase the time limit of a slurm running job
>
> scontrol update job= TimeLimit=16-00:00:00
>
> But we accidentally got it to 16 hours
>
> scontrol update job= TimeLimit=16:00:00
>
> This actually timeout and killed the running job and did not give any
> notification
>
> Is this a bug, should not the user be warned that this job will be
> killled ?
>
> Amjad
>
>

 --
 *Jason L. Simms, Ph.D., M.P.H.*
 Manager of Research Computing
 Swarthmore College
 Information Technology Services
 (610) 328-8102
 Schedule a meeting: https://calendly.com/jlsimms

 U.S. NAVAL RESEARCH LABORATORY
 Noam Bernstein, Ph.D.
 Center for Materials Physics and Technology
 U.S. Naval Research Laboratory
 T +1 202 404 8628 F +1 202 404 7546
 https://www.nrl.navy.mil


 --
>> *Jason L. Simms, Ph.D., M.P.H.*
>> Manager of Research Computing
>> Swarthmore College
>> Information Technology Services
>> (610) 328-8102
>> Schedule a meeting: https://calendly.com/jlsimms
>>
>


Re: [slurm-users] Nodes stuck in drain state

2023-05-25 Thread Davide DelVento
Can you ssh into the node and check the actual availability of memory?
Maybe there is a zombie process (or a healthy one with a memory leak bug)
that's hogging all the memory?
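
For example, something as simple as this (run as root on the drained node) usually tells the story:

free -g                            # overall memory picture
ps aux --sort=-%mem | head -n 10   # biggest memory consumers
slurmd -C                          # what the node actually detects vs. what slurm.conf claims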

On Thu, May 25, 2023 at 7:31 AM Roger Mason  wrote:

> Hello,
>
> Doug Meyer  writes:
>
> > Could also review the node log in /var/log/slurm/.  Often sinfo -lR will
> > tell you the cause, for example mem not matching the config.
> >
> REASON   USER TIMESTAMP   STATE  NODELIST
> Low RealMemory   slurm(468)   2023-05-25T09:26:59 drain* node012
> Not responding   slurm(468)   2023-05-25T09:30:31 down*
> node[001-003,008]
>
> But, as I said in my response to Ole, the memory in slurm.conf and in
> the 'show node' output match.
>
> Many thanks for the help.
>
> Roger
>
>


Re: [slurm-users] monitoring and accounting

2023-05-05 Thread Davide DelVento
At a place I worked before, we used XDMOD several years ago. It was a bit
tricky to set up correctly and not exactly intuitive to get started with
data collection as a user (managers, allocation specialists and
other not-super-technical people were most of our users). But when
familiarized with it, it worked great.
At the place I work now, monitoring and accounting is low on our priority
list, so it's been a while I haven't touched XDMOD. Hopefully now they have
improved user and administration friendliness, while keeping all the great
things that it could do.

On Fri, May 5, 2023 at 7:08 AM LEROY Christine 208562 <
christine.ler...@cea.fr> wrote:

> Hello Everyone,
>
> We would like to improve our visibility on our cluster usage.
>
> We have ganglia, and use sacct actually, but I was wondering if there was
> a web tool recommended to have both monitoring and accounting (user and
> admin friendly) ?
>
> Thanks in advance
>
> Christine
>
>
>
>
>
>
>


Re: [slurm-users] sharing licences with non slurm workers

2023-03-24 Thread Davide DelVento
Ciao Matteo,

If you look through the archives, you will see I struggled with this
problem too. A few people suggested some alternatives, but in the end
I did not find anything really satisfying which did not require a ton
of work for me.

Another piece of the story is users requesting a license but not using
it (I don't think there is a solution there, other than perhaps making
them "pay" something with their allocation to discourage the problem)
and others NOT requesting the license but using it nevertheless,
throwing off slurm's count. I think the latter could be solved
programmatically in slurm, but it felt like too much work compared to
simply educating the users... Depending on your userbase (and their
well-behaveness) you might or might not have this issue.

Cheers,
Davide


On Fri, Mar 24, 2023 at 8:06 AM Matteo Guglielmi
 wrote:
>
> Dear all,
>
>
> we have a license server which is allocating licenses to a bunch of workstations
>
> not managed with slurm (completely independent boxes) and the nodes of a 
> cluster,
>
> all managed with slurm.
>
>
> I wrote a simple script that keeps querying the number of licenses used by the
>
> outside "world" and changes the total number of available license in the slurm
>
> database.
>
>
> Everything works as expected except when all licenses are used outside of 
> slurm.
>
>
> When this occurs, the total number of licenses is set to zero and slurm 
> refuses
>
> to accept any job asking for one or more licenses with this error message:
>
>
> "sbatch: error: Batch job submission failed: Invalid license specification"
>
>
> So,
>
>
> is there a way to configure slurm to still accept and queue jobs even when the
>
> total number of licenses is set temporarily to zero?
>
>
> If my approach is not correct,
>
>
> is there a way to share a common license server between slurm workers and non
>
> slurm workers?
>
>
> Thank you.
>



Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-27 Thread Davide DelVento
> > And if you are seeing a workflow management system causing trouble on
> > your system, probably the most sustainable way of getting this resolved
> > is to file issues or pull requests with the respective project, with
> > suggestions like the ones you made. For snakemake, a second good point
> > to currently chime in, would be the issue discussing Slurm job array
> > support: https://github.com/snakemake/snakemake/issues/301
>
> I have to disagree here.  I think the onus is on the people in a given
> community to ensure that their software behaves well on the systems they
> want to use, not on the operators of those system.  Those of us running
> HPC systems often have to deal with a very large range of different
> pieces of software and time and personnel are limited.  If some program
> used by only a subset of the users is causing disruption, then it
> already costs us time and energy to mitigate those effects.  Even if I
> had the appropriate skill set, I don't see my self be writing many
> patches for workflow managers any time soon.

As someone who has worked in both roles (and to a degree still does) and
therefore can better understand the perspective from both parties, I
side more with David than with Loris here.

Yes, David wrote "or pull requests", but that's an OR.

Loris, if you know or experience a problem, it takes close to zero
time to file a bug report educating the author of the software about
the problem (or pointing them to places where they can educate
themselves). Otherwise they will never know about it, they will never
fix it, and potentially they think it's fine and will make the problem
worse. Yes, you could alternatively forbid the use of the problematic
software on the machine (I've done that on our systems), but users
with those needs will find ways to create the very same problem, and
perhaps worse, in other ways (they have done it on our system). Yes,
time is limited, and as operators of HPC systems we often don't have
the time to understand all the nuances and needs of all the users, but
that's not the point I am advocating. In fact it does seem to me that
David is putting the onus on himself and his community to make the
software behave correctly, and he is trying to educate himself about
what "correct" is like. So just give him the input he's looking for,
both here and (if and when snakemake causes troubles on your system)
by opening tickets on that repo, explaining the problem (definitely
not writing a PR for you, sorry David)



Re: [slurm-users] [External] Re: actual time of start (or finish) of a job

2023-02-20 Thread Davide DelVento
Thanks for pointing that out. In this case it doesn't matter, but I
can see how in others it may

On Mon, Feb 20, 2023 at 9:44 AM Florian Zillner  wrote:
>
> Hi,
>
> note that times reported by sacct may differ from the net times. For example, 
> imagine a test job like this:
> date
> sleep 1m
> date
>
> sacct reports:
> $ sacct -j 225145 -X -o jobid,start,end
> JobID  Start End
>  --- ---
> 225145   2023-02-20T17:31:12 2023-02-20T17:32:30
>
> Whereas:
> cat *out.225145
> Mon Feb 20 17:31:29 CET 2023
> Mon Feb 20 17:32:29 CET 2023
>
> Sometimes these extra few seconds matter. So if you're looking for net 
> runtimes, I'd suggest asking the user to include a date command here and 
> there in the submit script.
>
> Cheers,
> Florian
> 
> From: slurm-users  on behalf of Davide 
> DelVento 
> Sent: Thursday, 16 February 2023 01:40
> To: Slurm User Community List 
> Subject: [External] Re: [slurm-users] actual time of start (or finish) of a 
> job
>
> Thanks, that's exactly it.
> I naively assumed that the '-l' in sacct provided "everything" (given
> how long and unwieldy its output is), but I notice now that it doesn't.
> Sorry for the noise!
>
> On Wed, Feb 15, 2023 at 5:32 PM Joseph Francisco Guzman
>  wrote:
> >
> > Hi Davide,
> >
> > I would use the Start and End fields with the sacct command. Something like 
> > this: "sacct -j jobid1,jobid2 -X -P -o jobid,start,end".
> >
> > Were you able to take a look at the sacct manual page, which outlines what all of 
> > the different fields mean? Here's a link to the web version: 
> > https://slurm.schedmd.com/sacct.html.
> >
> > Best,
> >
> > Joseph
> >
> > ------
> > Joseph F. Guzman - ITS (Advanced Research Computing)
> >
> > Northern Arizona University
> >
> > joseph.f.guz...@nau.edu
> >
> > 
> > From: slurm-users  on behalf of 
> > Davide DelVento 
> > Sent: Wednesday, February 15, 2023 5:18 PM
> > To: slurm-us...@schedmd.com 
> > Subject: [slurm-users] actual time of start (or finish) of a job
> >
> > I have a user who needs to find the actual start (or finish) time of a
> > number of jobs.
> > With the elapsed field of sacct, start or finish becomes equivalent for
> > his search.
> >
> > I see that information in /var/log/slurm/slurmctld.log so Slurm should
> > have it, however in sacct itself that information does not seem to
> > exist, and with all the queries we tried Google always thinks we are
> > looking for something else and never returns an actual answer.
> >
> > If this was a one-off I could do it for him, but he needs to script it
> > for his reasons and I don't want to run his script as root nor give
> > him access to the log files forever.
> >
> > Is there a way to find this information?
> >
> > Thanks
> >
>



Re: [slurm-users] actual time of start (or finish) of a job

2023-02-15 Thread Davide DelVento
Thanks, that's exactly it.
I naively assumed that the '-l' in sacct provided "everything" (given
how long and unwieldy its output is), but I notice now that it doesn't.
Sorry for the noise!

On Wed, Feb 15, 2023 at 5:32 PM Joseph Francisco Guzman
 wrote:
>
> Hi Davide,
>
> I would use the Start and End fields with the sacct command. Something like 
> this: "sacct -j jobid1,jobid2 -X -P -o jobid,start,end".
>
> Were you able to take a look at the sacct manual page, which outlines what all of 
> the different fields mean? Here's a link to the web version: 
> https://slurm.schedmd.com/sacct.html.
>
> Best,
>
> Joseph
>
> --
> Joseph F. Guzman - ITS (Advanced Research Computing)
>
> Northern Arizona University
>
> joseph.f.guz...@nau.edu
>
> 
> From: slurm-users  on behalf of Davide 
> DelVento 
> Sent: Wednesday, February 15, 2023 5:18 PM
> To: slurm-us...@schedmd.com 
> Subject: [slurm-users] actual time of start (or finish) of a job
>
> I have a user who needs to find the actual start (or finish) time of a
> number of jobs.
> With the elapsed field of sacct, start or finish becomes equivalent for
> his search.
>
> I see that information in /var/log/slurm/slurmctld.log so Slurm should
> have it, however in sacct itself that information does not seem to
> exist, and with all the queries we tried Google always thinks we are
> looking for something else and never returns an actual answer.
>
> If this was a one-off I could do it for him, but he needs to script it
> for his reasons and I don't want to run his script as root nor give
> him access to the log files forever.
>
> Is there a way to find this information?
>
> Thanks
>



[slurm-users] actual time of start (or finish) of a job

2023-02-15 Thread Davide DelVento
I have a user who needs to find the actual start (or finish) time of a
number of jobs.
With the elapsed field of sacct, start or finish becomes equivalent for
his search.

I see that information in /var/log/slurm/slurmctld.log so Slurm should
have it, however in sacct itself that information does not seem to
exist, and with all the queries we tried Google always thinks we are
looking for something else and never returns an actual answer.

If this was a one-off I could do it for him, but he needs to script it
for his reasons and I don't want to run his script as root nor give
him access to the log files forever.

Is there a way to find this information?

Thanks



Re: [slurm-users] [EXT]Re: srun with &&, |, and > oh my!

2023-01-24 Thread Davide DelVento
Try first with small things like shell scripts you write which would
tell you where the thing is running (e.g. by using hostname). Keep in
mind that what would happen will most importantly depend on the shell.
For example, if you use "sudo", wildcards are tricky: the user you sudo
from may not have read permission to expand them, so the shell will
pass them literally to the command rather than expanding them as root.

Bottom line, good to be afraid. Understand shell syntax, but don't
trust and instead verify what happens with innocuous things before
opening the firehose :-)
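
For instance, a harmless experiment to see where each redirection actually happens (paths invented):

srun -N1 hostname > /tmp/out_local              # handled by the local shell: file lands where srun runs
srun -N1 bash -c 'hostname > /tmp/out_remote'   # handled inside bash -c: file lands on the compute node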

On Mon, Jan 23, 2023 at 9:31 PM Chandler  wrote:
>
> Williams, Gareth (IM, Black Mountain) wrote on 1/23/23 7:55 PM:
> > Be brave and experiment! How far wrong can you go?
> Hmm I do love breaking and re-fixing things...
>
> > srun bash -c "cmd1 infile1 | cmd2 opt2 arg2 | cmd3 opt3 arg3 -- > outfile 
> > && cmd4 opt4 arg4"
> Yes this will work!  Thanks!
>



Re: [slurm-users] How to read job accounting data long output? `sacct -l`

2022-12-14 Thread Davide DelVento
It would be very useful if there were a way (perhaps a custom script
parsing the sacct output) to provide the information in the same
format as "scontrol show job"

Has anybody attempted to do that?
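
A minimal sketch of what I have in mind, assuming a parsable sacct output and an example field list (extend to taste):

jobid=12345   # whatever job you are after
sacct -j "$jobid" -X --parsable2 \
      -o JobID,JobName,State,Start,End,Elapsed,ReqMem,AllocCPUS,NodeList |
awk -F'|' 'NR==1 {split($0, h); next}
           {for (i = 1; i <= NF; i++) printf "%s=%s ", h[i], $i; print ""}'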


On Wed, Dec 14, 2022 at 1:25 AM Will Furnass  wrote:
>
> If you pipe output into 'less -S' then you get horizontal scrolling.
>
> Will
>
> On Wed, 14 Dec 2022, 07:03 Chandler Sobel-Sorenson, 
>  wrote:
>>
>> Is there a recommended way to read output from `sacct` involving `-l` or 
>> `--long` option?  I have dual monitors and shrunk the terminal's font down 
>> to 6 pt or so until I could barely read it, giving me 675 columns.  This was 
>> still not enough...
>>
>> Perhaps there is a way of displaying it so the lines don't wrap and I can 
>> use left/right arrow keys to scroll the output, much like `systemctl` and 
>> `journalctl` can do?
>>
>> Perhaps there is a way to import it into a spreadsheet?
>>
>> This was with version 19.05 at least.  Apologies if the output has changed 
>> in newer versions...
>>
>> Thanks
>>
>>



Re: [slurm-users] Prolog and job_submit

2022-10-31 Thread Davide DelVento
Thanks for helping me find workarounds.

> My only other thought is that you might be able to use node features &
> job constraints to communicate this without the user realising.

I am not sure I understand this approach.

> For instance you could declare the nodes where the software is installed
> to have "Feature=mysoftware" and then your job submit could spot users
> requesting the license and add the constraint "mysoftware" to their job.
> The (root privileged) Prolog can see that via the SLURM_JOB_CONSTRAINTS
> environment variable and so could react to it.

Are you saying that while job_submit.lua can't directly add an
environmental variable that the prolog can see, it can add a
constraint, which will become an environmental variable that the
prolog can see?
Would that work if that feature is available on all nodes?
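
If so, the (root) Prolog side of the approach you describe would be little more than this sketch (the feature name and paths are invented for illustration):

#!/bin/bash
# react to the "mysoftware" constraint set by job_submit.lua
if [[ ",$SLURM_JOB_CONSTRAINTS," == *mysoftware* ]]; then
    cp /root/hidden/mysoftware /usr/local/bin/mysoftware
fi
exit 0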



Re: [slurm-users] Prolog and job_submit

2022-10-30 Thread Davide DelVento
Hi Chris,

> Unfortunately it looks like the license request information doesn't get
> propagated into any prologs from what I see from a scan of the
> documentation. :-(

Thanks. If I am reading you right, I did notice the same thing and in
fact that's why I wrote that  job_submit lua script which gets the
license information and sets an environmental variable, in the hope
that such a variable would be inherited by the prolog script.

But if I understand correctly your Prolog vs TaskProlog distinction,
the latter would have the environmental variable and run as user,
whereas the former runs as root and doesn't get the environment, not
even from the job_submit script. The problem with a TaskProlog
approach is that what I want to do (making a non-accessible file
available) would work best as root. As a workaround, I could make the
file merely obscure but still accessible to the user. Not ideal, but
better than nothing as it is now.

Alternatively, I could use another way to let the job_submit lua
script communicate with the Prolog, not sure exactly what (temp
directory on the shared filesystem, writeable only by root??)

Thanks for pointing to that commit. A bit too far down the road, but good to know.

Cheers,
Davide



Re: [slurm-users] Prolog and job_submit

2022-10-29 Thread Davide DelVento
Thanks Jeff.
That's exactly the documentation that I looked and quoted, and yes, I
know that the user running the prolog is a different one (root) from
the one which will be running the job (regular user submitting the
job).
I speculated that the sentence I quoted (again: prolog is executed
with the same environment as the user tasks to be initiated) meant
that root had the user environment, besides PATH for security reasons
as written in that page. Apparently that is not the case, and as such
I am stuck and can't solve my problem: even if I find an appropriate
prolog which runs as a regular user, it won't have the necessary
permissions to unhide the licensed binary. One step forward and two
steps backward, so frustrating!
Thanks again and have a nice weekend

On Sat, Oct 29, 2022 at 11:06 AM Sarlo, Jeffrey S  wrote:
>
> Not sure if this will help.  It shows which user will execute the scripts
>
> https://slurm.schedmd.com/prolog_epilog.html
>
> Maybe the variable isn't set for the user executing the 
> prolog/epilog/taskprolog
>
> Jeff
>
> 
> From: slurm-users  on behalf of Davide 
> DelVento 
> Sent: Saturday, October 29, 2022 9:37 AM
> To: slurm-us...@schedmd.com 
> Subject: [slurm-users] Prolog and job_submit
>
> My problem: grant licensed software availability to my users only if
> they request it on slurm; for now with local licenses.
>
> I wrote a job_submit lua script which checks job_desc.licenses and if
> it contains the appropriate strings it sets an appropriate
> SOMETHING_LICENSE_REQ environmental variable.
>
> This part works, I can see the environmental variable correctly set in
> the jobs that require the license.
>
> Now this licensed software is a bit tricky to manage, so the way that
> I thought to use it is simply to make its binary disappear from the
> nodes when not requested, with a prolog and epilog scripts which copy
> it from a location in the shared filesystem accessible only by root.
> Simply a copy during the prolog and a delete during the epilog.
>
> Something like this:
>
> if [[ $SOMETHING_LICENSE_REQ == 1 ]]; then
> # copy the binary
> fi
>
> After banging my head to make the prolog run (executable bit and full
> path required, and not said so in the documentation and error logs
> being cryptic about it) I am finally able to see it running only
> to find out that the SOMETHING_LICENSE_REQ environmental variable is
> not set, despite the documentation at
> https://slurm.schedmd.com/prolog_epilog.html stating
>
> > The task prolog is executed with the same environment as the user tasks to 
> > be initiated.
>
> Now, I'd be very happy to do this copy from the job_submit, but that
> is run on the head node (I checked) and so I can't do that. It would
> seem strange that the job_submit is run after the prolog, since the
> latter runs on the compute node (I checked that too).
>
> Moreover, I also verified with additional environmental variables
> which I set at submit time are available for the user job correctly,
> but not in the prolog.
>
> So either I misinterpreted that "same environment as the user tasks"
> or there is something else that I am doing wrong.
>
> Does anybody have any insight?
>



[slurm-users] Prolog and job_submit

2022-10-29 Thread Davide DelVento
My problem: grant licensed software availability to my users only if
they request it on slurm; for now with local licenses.

I wrote a job_submit lua script which checks job_desc.licenses and if
it contains the appropriate strings it sets an appropriate
SOMETHING_LICENSE_REQ environmental variable.

This part works, I can see the environmental variable correctly set in
the jobs that require the license.

Now this licensed software is a bit tricky to manage, so the way that
I thought to use it is simply to make its binary disappear from the
nodes when not requested, with a prolog and epilog scripts which copy
it from a location in the shared filesystem accessible only by root.
Simply a copy during the prolog and a delete during the epilog.

Something like this:

if [[ $SOMETHING_LICENSE_REQ == 1 ]]; then
    # copy the binary
fi

After banging my head to make the prolog run (the executable bit and full
path are required, which the documentation doesn't say and the error logs
are cryptic about), I was finally able to see it running, only
to find out that the SOMETHING_LICENSE_REQ environmental variable is
not set, despite the documentation at
https://slurm.schedmd.com/prolog_epilog.html stating

> The task prolog is executed with the same environment as the user tasks to be 
> initiated.

Now, I'd be very happy to do this copy from the job_submit, but that
is run on the head node (I checked) and so I can't do that. It would
seem strange that the job_submit is run after the prolog, since the
latter runs on the compute node (I checked that too).

Moreover, I also verified with additional environmental variables
which I set at submit time are available for the user job correctly,
but not in the prolog.

So either I misinterpreted that "same environment as the user tasks"
or there is something else that I am doing wrong.

Does anybody have any insight?



Re: [slurm-users] How to debug a prolog script?

2022-10-29 Thread Davide DelVento
Finally I found some time available when I could do the job without
disrupting my users.

It turned out to be both the permissions issue as discussed here, and
the fact that the slurm.conf needs the fully qualified path of the
prolog script.

So that is solved, but sadly my problem is not solved as I will
describe in another thread.
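
For the archives, the combination that worked looks roughly like this (example path):

# in slurm.conf -- fully qualified path only:
Prolog=/shared/slurm/prolog.sh
# and the script must carry the execute bit for root on every node:
chmod 755 /shared/slurm/prolog.sh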

On Sun, Sep 18, 2022 at 11:57 PM Bjørn-Helge Mevik
 wrote:
>
> Davide DelVento  writes:
>
> >> I'm curious: What kind of disruption did it cause for your production
> >> jobs?
> >
> > All jobs failed and went in pending/held with "launch failed requeued
> > held" status, all nodes where the jobs were scheduled went draining.
> >
> > The logs only said "error: validate_node_specs: Prolog or job env
> > setup failure on node , draining the node". I guess if they said
> > "-bash: /path/to/prolog: Permission denied" I would have caught the
> > problem myself.
>
> But that is not a problem caused by having things like
>
> exec &> /root/prolog_slurmd.$$
>
> in the script, as you indicated.  It is a problem caused by the prolog
> script file not being executable.
>
> > In hindsight it is obvious, but I don't think even the documentation
> > mentions that, does it? After all, you can execute a non-executable
> > file with "sh filename", so I made the incorrect
> > assumption that slurm would have invoked the prolog that way.
>
> Slurm prologs can be written in any language - we used to have perl
> prolog scripts. :)
>
> --
> Regards,
> Bjørn-Helge Mevik, dr. scient,
> Department for Research Computing, University of Oslo
>



Re: [slurm-users] Check consistency

2022-10-12 Thread Davide DelVento
Thanks. I don't see anything wrong from that log.
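
In the meantime, a crude way to at least confirm that the on-disk configs match across nodes is something like this sketch (assuming clustershell is installed and the configs live in /etc/slurm; it says nothing about what a running slurmd has actually loaded):

clush -b -w node[001-064] 'md5sum /etc/slurm/slurm.conf /etc/slurm/cgroup.conf'
md5sum /etc/slurm/slurm.conf /etc/slurm/cgroup.conf   # on the head node, for comparison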


On Fri, Oct 7, 2022 at 7:32 AM Paul Edmon  wrote:
>
> The slurmctld log will print out if hosts are out of sync with the
> slurmctld slurm.conf.  That said it doesn't report on cgroup consistency
> changes like that.  It's possible that dialing up the verbosity on the
> slurmd logs may give that info but I haven't seen it in normal operation.
>
> -Paul Edmon-
>
> On 10/6/22 5:47 PM, Davide DelVento wrote:
> > Is there a simple way to check that what slurm is running is what the
> > config says it should be?
> >
> > For example, my understanding is that changing cgroup.conf should be
> > followed by 'systemctl stop slurmd' on all compute nodes, then
> > 'systemctl restart slurmctld' on the head node, then 'systemctl start
> > slurmd' on the compute nodes.
> >
> > Assuming this is correct, is there a way to query the nodes and ask if
> > they are indeed running what the config is saying (or alternatively
> > have them dump their config files somewhere for me to manually run a
> > diff on)?
> >
> > Thanks,
> > Davide
> >
>



[slurm-users] Check consistency

2022-10-06 Thread Davide DelVento
Is there a simple way to check that what slurm is running is what the
config says it should be?

For example, my understanding is that changing cgroup.conf should be
followed by 'systemctl stop slurmd' on all compute nodes, then
'systemctl restart slurmctld' on the head node, then 'systemctl start
slurmd' on the compute nodes.

Assuming this is correct, is there a way to query the nodes and ask if
they are indeed running what the config is saying (or alternatively
have them dump their config files somewhere for me to manually run a
diff on)?

Thanks,
Davide



Re: [slurm-users] X11 forwarding, slurm-22.05.3, hostbased auth

2022-10-06 Thread Davide DelVento
Perhaps just a very trivial question, but it doesn't look like you
mentioned it: does your X forwarding work from the login node? Maybe
the X server on your client is the problem, and trying xclock on the
login node would clarify that.
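
In other words, something as simple as (hostname invented):

# on your workstation
ssh -X login-node
# then, on the login node
echo $DISPLAY   # should be set, e.g. localhost:10.0
xclock          # should pop up on your workstation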

On Wed, Oct 5, 2022 at 12:03 PM Allan Streib  wrote:
>
> Hi everyone,
>
> I'm trying to get X11 forwarding working on my cluster. I've read some
> of the threads and web posts on X11 forwarding and most of the common
> issues I'm finding seem to pertain to older versions of Slurm.
>
> I log in from my workstation to the login node with ssh -X. I have x11
> apps installed on a test compute node, j-096. Here is what I see:
>
> From the config.log when I built slurm:
>
> $ grep X11 config.log
> configure:19906: checking whether Slurm internal X11 support is enabled
> | #define WITH_SLURM_X11 1
> | #define WITH_SLURM_X11 1
> | #define WITH_SLURM_X11 1
> #define WITH_SLURM_X11 1
>
>
> From the login node:
>
> $ scontrol show config | grep X11
> PrologFlags = Alloc,Contain,X11
> X11Parameters   = home_xauthority
>
> $ grep ^X11 /etc/ssh/sshd_config
> X11Forwarding yes
> X11UseLocalhost no
>
>
> Here is what I see when I try to run "xclock" on my test node:
>
> $ srun --x11 -w j-096 xclock
> Error: Can't open display: localhost:64.0
> srun: error: j-096: task 0: Exited with exit code 1
>
>
> From the sshd_config on the test node:
>
> $ grep ^X11 /etc/ssh/sshd_config
> X11Forwarding yes
>
> We are using hostbased ssh authentication in this cluster.
>
> From the slurmd.log on the test node:
>
> [2022-10-05T13:29:51.065] [2822.extern] X11 forwarding established on 
> DISPLAY=j-096:64.0
> [2022-10-05T13:29:51.165] launch task StepId=2822.0 request from UID:8348 
> GID:100 HOST:172.16.100.132 PORT:58948
> [2022-10-05T13:29:51.165] task/affinity: lllp_distribution: JobId=2822 
> auto binding off: mask_cpu
> [2022-10-05T13:29:51.311] [2822.extern] error: _x11_socket_read: 
> slurm_open_msg_conn(127.0.0.1:34811): Connection refused
> [2022-10-05T13:29:51.330] [2822.0] done with job
> [2022-10-05T13:29:51.346] [2822.extern] done with job
> [2022-10-05T13:29:51.436] [2822.extern] x11 forwarding shutdown complete
>
> Is the issue the two different DISPLAY values, i.e. j-096:64.0
> vs. localhost:64.0. Not sure how/where to reconcile these? I have tried
> with and without "X11UseLocalhost no" on the login node.
>
> Best wishes,
>
> Allan
>



Re: [slurm-users] Detecting non-MPI jobs running on multiple nodes

2022-09-29 Thread Davide DelVento
At my previous job there were cron jobs running everywhere measuring
possibly idle cores which were eventually averaged out for the
duration of the job, and reported (the day after) via email to the
user support team.
I believe they stopped doing so when compute became (relatively) cheap
at the expense of memory and I/O becoming expensive.

I know, it does not help you much, but perhaps something to think about
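
If you just want a rough, interactive check today, something along these lines (a sketch assuming ssh access to the compute nodes; thresholds and reporting left out) can flag suspicious multi-node jobs:

# eyeball the load on every node of each running multi-node job
for jobid in $(squeue -h -t R -o '%A %D' | awk '$2 > 1 {print $1}'); do
    echo "=== job $jobid ==="
    for node in $(scontrol show hostnames "$(squeue -h -j "$jobid" -o %N)"); do
        ssh "$node" 'echo "$(hostname): $(cat /proc/loadavg)"'
    done
done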

On Thu, Sep 29, 2022 at 1:29 AM Loris Bennett
 wrote:
>
> Hi,
>
> Has anyone already come up with a good way to identify non-MPI jobs which
> request multiple cores but don't restrict themselves to a single node,
> leaving cores idle on all but the first node?
>
> I can see that this is potentially not easy, since an MPI job might have
> still have phases where only one core is actually being used.
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Herr/Mr)
> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
>



Re: [slurm-users] slurm jobs and and amount of licenses (matlab)

2022-09-26 Thread Davide DelVento
Are your licenses used only for the slurm cluster(s) or are they
shared with laptops, workstations and/or other computing equipment not
managed by slurm?
In the former case, the "local" licenses described in the
documentation will do the trick (but slurm does not automatically
enforce their use, so either strong user education is needed, or
further scripting). In the latter case, more work is needed. See my
other thread on this topic two weeks ago, which I plan to pick up
later this week.
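
For the local-license case the setup is tiny; roughly (names and counts invented):

# slurm.conf -- a purely local counter
Licenses=matlab_dcs:16,matlab_image:4

# and the job script has to request it explicitly:
#SBATCH --licenses=matlab_image:1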

On Mon, Sep 26, 2022 at 5:07 AM Josef Dvoracek  wrote:
>
> hello @list!
>
> anyone who was dealing with following scenario?
>
> * we have a limited amount of Matlab network licenses (and various
> features have various amounts of available seats, e.g. machine learning: N
> licenses, Image_Toolbox: M licenses)
> * licenses are being used by slurm jobs and by individual users directly
> at their workstations (workstations are not under my control)
>
> Sometimes it happens, that licenses for certain feature, used in
> particular slurm job is already fully consumed, and job fails.
>
> Is there any straightforward trick how to deal with that? Other than
> buying dedicated pool of licenses for our slurm-based data processing
> facility?
>
> EG. let slurm job wait, until there is required license available?
>
> cheers
>
> josef
>
>
>
>



Re: [slurm-users] remote license

2022-09-16 Thread Davide DelVento
Hi Brian,

From your response, I gather that my wording sounded harsh or
disrespectful. That was not my intention, and therefore I sincerely
apologize for it.

In fact my perplexity is certainly due to my ignorance (as it must be
very clear by the number and "quality" of queries that I am posting on
this mailing list). It seemed to me that what is currently available
is a special edge case, whereas the simpler one is not covered, so I
was (perhaps still am) convinced that I must have misunderstood how
things work. Perhaps my "am I missing something?" sounded rhetorical
rather than sincere, which was the spirit in which I wrote it. Sorry about
that.

I find it a nice coincidence that you suggested paying SchedMD for
this, because after the clarifications which I am trying to get in
this thread, I thought to ask my management to do exactly that!!!

For the local licenses, are you suggesting programmatically changing
slurm.conf and reconfiguring, e.g. from a cron job?

Thanks a lot for your help and have a great weekend

Davide






On Fri, Sep 16, 2022 at 11:52 AM Brian Andrus  wrote:
>
> Davide,
>
> I'll not engage on this. If you want a feature, pay SchedMD for support
> and they will prioritize it and work on it.  You are already using a
> very impressive bit of software for free.
>
> As far as local license updates, yes, you can do the local license and
> reconfigure regularly. Feel free to do that. It is not something that
> scales well, but it looks like you have a rather beginner cluster that
> would never be impacted by such choices.
>
> Brian Andrus
>
>
> On 9/16/2022 10:00 AM, Davide DelVento wrote:
> > Thanks Brian.
> >
> > I am still perplexed. What is a database to install, administer,
> > patch, update, could break, be down, etc buying us? I see limited use
> > cases, e.g. a license server which does not provide the license
> > count/use in a parsable way, and that someone wants to use with
> > multiple SLURM installations (if it's on a single one, the local
> > license is perfect). Wouldn't it be much, much easier for everybody if
> > one could specify a script (the bullet 1. you mentioned) inside SLURM,
> > and use the license server ITSELF as the authoritative source of
> > license count? Sure, it won't be perfect, e.g. race conditions in
> > license acquisition can still cause failures, but the database won't
> > be fixing that
> > I must be missing something
> >
> > Alternatively, can one update the license count of local license with
> > a scontrol command, rather than changing the slurm.conf and
> > reconfigure? That could make what I say possible
> >
> > Thanks
> >
> > On Fri, Sep 16, 2022 at 9:25 AM Brian Andrus  wrote:
> >> Davide,
> >>
> >> You have it pretty correct. While the database itself is not part of the
> >> slurm suite, slurmdbd (which would access the database) is.
> >>
> >> As far as writing something that keeps things updated, I'm sure many
> >> have done this. However, it would be unique to your installation. The
> >> specific number of licenses, naming them, what license server is being
> >> used, etc.
> >> All of that could easily be a few lines in a script that you have in a
> >> cron job or other trigger (eg prolog/epilog). You would just:
> >>
> >> 1) Read/parse current licenses/use (eg: if you are using flexlm, lmutil
> >> lmstat output)
> >> 2) Update the database (sacctmgr command)
> >>
> >> As you can see, that 1st step would be highly dependent on you and your
> >> environment. The 2nd step would be dependent on what things you are
> >> tracking within that.
> >>
> >> Brian Andrus
> >>
> >>
> >> On 9/16/2022 5:01 AM, Davide DelVento wrote:
> >>> So if I understand correctly, this "remote database" is something that
> >>> is neither part of slurm itself, nor part of the license server per
> >>> se, correct?
> >>>
> >>> Regarding the "if you got creative", has anybody on this list done
> >>> that already? I can't believe I'm the first one wanting this feature!
> >>> Matching the number in that database with the actual number the
> >>> license server knows would be extremely helpful! We use various
> >>> license servers (for various licensed software), so each one of them
> >>> would be useful. I can probably script/develop one of these myself,
> >>> but I am not sure I've got the time...
> >>>
> >>> Thanks!
> >>>
> >>> On Thu, Sep 15, 2022 at 6:04 PM Brian

Re: [slurm-users] remote license

2022-09-16 Thread Davide DelVento
Thanks Brian.

I am still perplexed. What is a database to install, administer,
patch, update, could break, be down, etc buying us? I see limited use
cases, e.g. a license server which does not provide the license
count/use in a parsable way, and that someone wants to use with
multiple SLURM installations (if it's on a single one, the local
license is perfect). Wouldn't it be much, much easier for everybody if
one could specify a script (the bullet 1. you mentioned) inside SLURM,
and use the license server ITSELF as the authoritative source of
license count? Sure, it won't be perfect, e.g. race conditions in
license acquisition can still cause failures, but the database won't
be fixing that
I must be missing something

Alternatively, can one update the license count of a local license with
an scontrol command, rather than changing slurm.conf and
reconfiguring? That could make what I describe possible.
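
For the archives, a very rough sketch of the kind of cron-driven sync Brian describes below; FlexLM is assumed, and the lmstat parsing, port, names and the pre-existing "matlab" remote resource are all site-specific assumptions:

#!/bin/bash
# keep the slurmdbd license count in sync with what FlexLM reports
TOTAL=100    # seats owned overall
IN_USE=$(lmutil lmstat -c 27000@licsrv -f MATLAB \
         | awk '/Users of MATLAB:/ {print $11}')   # "... Total of N licenses in use"
AVAIL=$(( TOTAL - ${IN_USE:-0} ))
(( AVAIL < 0 )) && AVAIL=0
# assumes a remote resource "matlab" with server=licsrv already exists in slurmdbd
sacctmgr -i modify resource name=matlab server=licsrv set count=$AVAIL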

Thanks

On Fri, Sep 16, 2022 at 9:25 AM Brian Andrus  wrote:
>
> Davide,
>
> You have it pretty correct. While the database itself is not part of the
> slurm suite, slurmdbd (which would access the database) is.
>
> As far as writing something that keeps things updated, I'm sure many
> have done this. However, it would be unique to your installation. The
> specific number of licenses, naming them, what license server is being
> used, etc.
> All of that could easily be a few lines in a script that you have in a
> cron job or other trigger (eg prolog/epilog). You would just:
>
> 1) Read/parse current licenses/use (eg: if you are using flexlm, lmutil
> lmstat output)
> 2) Update the database (sacctmgr command)
>
> As you can see, that 1st step would be highly dependent on you and your
> environment. The 2nd step would be dependent on what things you are
> tracking within that.
>
> Brian Andrus
>
>
> On 9/16/2022 5:01 AM, Davide DelVento wrote:
> > So if I understand correctly, this "remote database" is something that
> > is neither part of slurm itself, nor part of the license server per
> > se, correct?
> >
> > Regarding the "if you got creative", has anybody on this list done
> > that already? I can't believe I'm the first one wanting this feature!
> > Matching the number in that database with the actual number the
> > license server knows would be extremely helpful! We use various
> > license servers (for various licensed software), so each one of them
> > would be useful. I can probably script/develop one of these myself,
> > but I am not sure I've got the time...
> >
> > Thanks!
> >
> > On Thu, Sep 15, 2022 at 6:04 PM Brian Andrus  wrote:
> >> So if you follow the links to: https://slurm.schedmd.com/licenses.html
> >> you should see the difference.
> >>
> >> Local licenses are just a counter that is setup in slurm.conf
> >> Remote licenses are a counter in a database (the database is "remote"),
> >> so you can change/update it dynamically. So, you could change their
> >> allocation with a sacctmgr command. It is especially useful when you are
> >> managing multiple clusters that share licenses. You can allocate that a
> >> certain number are allowed by each cluster and change that if needed.
> >>
> >> If you got creative, you could keep the license count that is in the
> >> database updated to match the number free from flexlm to stop license
> >> starvation due to users outside slurm using them up so they really
> >> aren't available to slurm.
> >>
> >> Brian Andrus
> >>
> >>
> >> On 9/15/2022 3:34 PM, Davide DelVento wrote:
> >>> I am a bit confused by remote licenses.
> >>>
> >>> https://lists.schedmd.com/pipermail/slurm-users/2020-September/006049.html
> >>> (which is only 2 years old) claims that they are just a counter, so
> >>> like local licenses. Then why call them remote?
> >>>
> >>> Only a few days after, this
> >>> https://lists.schedmd.com/pipermail/slurm-users/2020-September/006081.html
> >>> appeared to imply (but not clearly stated) that the remote license are
> >>> not simply a counter, but then it's not clear how they are different.
> >>>
> >>> The current documentation (and attempts to run the "add resource"
> >>> command) says that one must use the license count, which seems to
> >>> imply they are just a simple counter (but then what do they need the
> >>> server for)?
> >>>
> >>> So what is what?
> >>>
> >>> In my cursory past experience with this, it seemed that it were
> >>> possible to query a license server (at least some of them) to get the
> >>> actual number of available licenses and schedule (or let jobs pending)
> >>> accordingly. Which would be very helpful for the not-too-uncommon
> >>> situation in which the same license server provides licenses for both
> >>> the HPC cluster and other non-slurm-controlled resources, such a
> >>> user's workstations. Was that impression wrong, or perhaps somebody
> >>> scripted it in some way? If the latter, does anybody know if those
> >>> scripts are publicly available anywhere?
> >>>
> >>> Thanks
> >>>
>



Re: [slurm-users] How to debug a prolog script?

2022-09-16 Thread Davide DelVento
Thanks a lot.

> > Does it need the execution permission? For root alone sufficient?
>
> slurmd runs as root, so it only need exec perms for root.

Perfect. That must have been then, since my script (like the example
one) did not have the execution permission on.

> I'm curious: What kind of disruption did it cause for your production
> jobs?

All jobs failed and went in pending/held with "launch failed requeued
held" status, all nodes where the jobs were scheduled went draining.

The logs only said "error: validate_node_specs: Prolog or job env
setup failure on node , draining the node". I guess if they said
"-bash: /path/to/prolog: Permission denied" I would have caught the
problem myself.

In hindsight it is obvious, but I don't think even the documentation
mentions that, does it? After all, you can execute a non-executable
file with "sh filename", so I made the incorrect
assumption that slurm would have invoked the prolog that way.

Thanks!



Re: [slurm-users] How to debug a prolog script?

2022-09-16 Thread Davide DelVento
Thanks to both of you.

> Permissions on the file itself (and the directories in the path to it)

Does it need the execute permission? Is it sufficient for root alone to have it?

> Existence of the script on the nodes (prologue is run on the nodes, not the 
> head)

Yes, it's in a shared filesystem.

> Not sure your error is the prologue script itself. Does everything run fine 
> with no prologue configured?

Yes, everything has been working fine for months and still does as
soon as I take the prolog out of slurm.conf.

> > 2. How to debug the issue?
> I'd try capturing all stdout and stderr from the script into a file on the 
> compute
> node, for instance like this:
>
> exec &> /root/prolog_slurmd.$$
> set -x # To print out all commands

Do you mean INSIDE the prologue script itself? Yes, this is what I'd
have done, if it weren't so disruptive of all my production jobs,
hence I had to turn it off before wreaking too much havoc.


> > Even increasing the debug level the
> > slurmctld.log contains simply a "error: validate_node_specs: Prolog or
> > job env setup failure on node xxx, draining the node" message, without
> > even a line number or anything.
>
> Slurm only executes the prolog script.  It doesn't parse it or evaluate
> it itself, so it has no way of knowing what fails inside the script.

Sure, but even "just executing" there are stdout and stderr, which
could be captured and logged rather than thrown away, forcing one to
do the above.

> > 3. And more generally, how to debug a prolog (and epilog) script
> > without disrupting all production jobs? Unfortunately we can't have
> > another slurm install for testing, is there a sbatch option to force
> > utilizing a prolog script which would not be executed for all the
> > other jobs? Or perhaps making a dedicated queue?
>
> I tend to reserve a node, install the updated prolog scripts there, and
> run test jobs asking for that reservation.

How do you "install the prolog scripts there"? Isn't the prolog
setting in slurm.conf global?

> (Otherwise one could always
> set up a small cluster of VMs and use that for simpler testing.)

Yes, but I need to request that cluster of VMs from IT, have the same OS
installed and configured (and to be 100% identical, it needs to be
RHEL, so a paid license), and keep everything synced with the actual
cluster. I know it'd be very useful, but sadly we don't have the
resources to do that, so unfortunately this is not an option for me.

Thanks again.



Re: [slurm-users] remote license

2022-09-16 Thread Davide DelVento
So if I understand correctly, this "remote database" is something that
is neither part of slurm itself, nor part of the license server per
se, correct?

Regarding the "if you got creative", has anybody on this list done
that already? I can't believe I'm the first one wanting this feature!
Matching the number in that database with the actual number the
license server knows would be extremely helpful! We use various
license servers (for various licensed software), so each one of them
would be useful. I can probably script/develop one of these myself,
but I am not sure I've got the time...

Thanks!

On Thu, Sep 15, 2022 at 6:04 PM Brian Andrus  wrote:
>
> So if you follow the links to: https://slurm.schedmd.com/licenses.html
> you should see the difference.
>
> Local licenses are just a counter that is setup in slurm.conf
> Remote licenses are a counter in a database (the database is "remote"),
> so you can change/update it dynamically. So, you could change their
> allocation with a sacctmgr command. It is especially useful when you are
> managing multiple clusters that share licenses. You can allocate that a
> certain number are allowed by each cluster and change that if needed.
>
> If you got creative, you could keep the license count that is in the
> database updated to match the number free from flexlm to stop license
> starvation due to users outside slurm using them up so they really
> aren't available to slurm.
>
> Brian Andrus
>
>
> On 9/15/2022 3:34 PM, Davide DelVento wrote:
> > I am a bit confused by remote licenses.
> >
> > https://lists.schedmd.com/pipermail/slurm-users/2020-September/006049.html
> > (which is only 2 years old) claims that they are just a counter, so
> > like local licenses. Then why call them remote?
> >
> > Only a few days after, this
> > https://lists.schedmd.com/pipermail/slurm-users/2020-September/006081.html
> > appeared to imply (but not clearly stated) that the remote license are
> > not simply a counter, but then it's not clear how they are different.
> >
> > The current documentation (and attempts to run the "add resource"
> > command) says that one must use the license count, which seems to
> > imply they are just a simple counter (but then what do they need the
> > server for)?
> >
> > So what is what?
> >
> > In my cursory past experience with this, it seemed that it were
> > possible to query a license server (at least some of them) to get the
> > actual number of available licenses and schedule (or let jobs pending)
> > accordingly. Which would be very helpful for the not-too-uncommon
> > situation in which the same license server provides licenses for both
> > the HPC cluster and other non-slurm-controlled resources, such a
> > user's workstations. Was that impression wrong, or perhaps somebody
> > scripted it in some way? If the latter, does anybody know if those
> > scripts are publicly available anywhere?
> >
> > Thanks
> >
>



[slurm-users] remote license

2022-09-15 Thread Davide DelVento
I am a bit confused by remote licenses.

https://lists.schedmd.com/pipermail/slurm-users/2020-September/006049.html
(which is only 2 years old) claims that they are just a counter, so
like local licenses. Then why call them remote?

Only a few days after, this
https://lists.schedmd.com/pipermail/slurm-users/2020-September/006081.html
appeared to imply (but not clearly stated) that the remote license are
not simply a counter, but then it's not clear how they are different.

The current documentation (and attempts to run the "add resource"
command) says that one must use the license count, which seems to
imply they are just a simple counter (but then what do they need the
server for)?

So what is what?

In my cursory past experience with this, it seemed that it was
possible to query a license server (at least some of them) to get the
actual number of available licenses and schedule (or leave jobs pending)
accordingly. That would be very helpful for the not-too-uncommon
situation in which the same license server provides licenses for both
the HPC cluster and other non-slurm-controlled resources, such as a
user's workstations. Was that impression wrong, or perhaps somebody
scripted it in some way? If the latter, does anybody know if those
scripts are publicly available anywhere?

Thanks



[slurm-users] How to debug a prolog script?

2022-09-15 Thread Davide DelVento
I have a super simple prolog script, as follows (very similar to the
example one)

#!/bin/bash

if [[ $VAR == 1 ]]; then
echo "True"
fi

exit 0

This fails (and obviously causes great disruption to my production
jobs). So I have two questions:

1. Why does it fail? It does so regardless of the value of the
variable, so it must not be the echo not being in the PATH (note that
[[ is a shell keyword). I understand that the echo command will go in
a black hole and I should use "print ..." (not sure about its syntax,
and the documentation is very cryptic, but I digress) or perhaps
logger (as the example does), and I tried some of them with no luck.

2. How to debug the issue? Even increasing the debug level the
slurmctld.log contains simply a "error: validate_node_specs: Prolog or
job env setup failure on node xxx, draining the node" message, without
even a line number or anything. Google does not return anything useful
about this message

3. And more generally, how to debug a prolog (and epilog) script
without disrupting all production jobs? Unfortunately we can't have
another slurm install for testing, is there a sbatch option to force
utilizing a prolog script which would not be executed for all the
other jobs? Or perhaps making a dedicated queue?



[slurm-users] SUG22?

2022-09-14 Thread Davide DelVento
Does anybody know anything about the SUG22 announced
https://lists.schedmd.com/pipermail/slurm-announce/2022/82.html
for next week?

https://schedmd.com/events.php and https://schedmd.com/news.php do not
mention anything about it



Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-12 Thread Davide DelVento
For other poor souls coming to this conversation, here is the conclusion.

$ sbatch --version
slurm 21.08.5

$ # irrelevant parts omitted from copy-paste for brevity
$ cat /opt/slurm/job_submit.lua
log_prefix = 'slurm_job_submit'
function slurm_job_submit(job_desc, part_list, submit_uid)
slurm.log_info("LOOPING")
for key, value in pairs(job_desc) do
slurm.log_info("%s: key=%s value=%s", log_prefix, key, value)
end
slurm.log_info("END LOOPING")
slurm.log_info("%s: user %s(%u) job_name=%s", log_prefix,
job_desc.user_name, submit_uid, job_desc.name)
return slurm.SUCCESS
end

$ sudo tail /var/log/slurm/slurmctld.log
[2022-09-12T08:13:16.310] _slurm_rpc_submit_batch_job: JobId=16028
InitPrio=4294887729 usec=198
[2022-09-12T08:13:17.213] sched: Allocate JobId=16028
NodeList=co49svnode13 #CPUs=16 Partition=compute512
[2022-09-12T08:15:01.090] lua: LOOPING
[2022-09-12T08:15:01.090] lua: END LOOPING
[2022-09-12T08:15:01.090] lua: slurm_job_submit: user
ddvento(57002254) job_name=lic.slurm
[2022-09-12T08:15:01.091] _slurm_rpc_submit_batch_job: JobId=16029
InitPrio=4294887728 usec=204
[2022-09-12T08:15:01.289] sched/backfill: _start_job: Started
JobId=16029 in compute512 on co49svnode14
[2022-09-12T08:15:17.330] _job_complete: JobId=16028 WEXITSTATUS 0
[2022-09-12T08:15:17.330] _job_complete: JobId=16028 done

Note how the loop body produces no output (LOOPING is immediately
followed by END LOOPING). This is probably just my naivete and
ignorance of how Lua works.

In conclusion, I am now able to see job_desc.licenses for example with:

slurm.log_info("%s: licenses=%s", log_prefix, job_desc.licenses)

So now I need to just implement my logic.

Thank you everybody in this conversation for helping to sort out this
issue. Greatly appreciated!

On Thu, Sep 8, 2022 at 9:22 AM Davide DelVento  wrote:
>
> Thanks Ole, for this clarification, this is very good to know.
>
> However, the problem is that the very example provided by slurm itself
> is the one that has the error. I removed the unpack part with the
> variable arguments and that fixed that part.
>
> Unfortunately, the job_desc table is always empty so the whole
> job_submit.lua seems like a moot point? Or the example is so outdated
> (given that it cannot even log correctly) that this is now performed
> in a different way??
> Davide
>
> On Thu, Sep 8, 2022 at 12:23 AM Ole Holm Nielsen
>  wrote:
> >
> > Hi Davide,
> >
> > In your slurmctld log you see an entry "error: job_submit/lua:
> > /opt/slurm/job_submit.lua".
> >
> > What I think happens is that when slurmctld encounters an error in
> > job_submit.lua, it will revert to the last known good script cached by
> > slurmctld and ignore the file on disk from now on, even if it has been
> > corrected.  An "scontrol reconfig" may make slurmctld reread the
> > job_submit.lua, please try it.
> >
> > I believe that this slurmctld behavior is undocumented at present.  Please
> > see https://bugs.schedmd.com/show_bug.cgi?id=14472#c15 for a description:
> >
> > > And, if after the reconfigure, the job_submit.lua is wrong formatted (or 
> > > missing), it will use the previous version of the script (which we have 
> > > stored backup previously):
> >
> > /Ole
> >
> >
> > On 9/7/22 14:21, Davide DelVento wrote:
> > > Thanks Ole, your wiki page sheds some light on this mystery.
> > > Very frustrating that even the simple example provided in the release
> > > fails, and it fails at the most basic logging functionality.
> > >
> > > Note that "my" job_submit.lua is now the unmodified, slurm-provided
> > > one and that the luac command returns nothing in my case (this is
> > > Lua 5.3.4) so syntax seems correct?
> > >
> > > Yet the logs report the problem I mentioned rather than the actual
> > > content that the plugin is attempting to log.
> > >
> > > On Wed, Sep 7, 2022 at 2:13 AM Ole Holm Nielsen
> > >  wrote:
> > >>
> > >> Hi Davide,
> > >>
> > >> I suggest that you check your job_submit.lua script with the LUA 
> > >> compiler:
> > >>
> > >> luac -p /etc/slurm/job_submit.lua
> > >>
> > >> I have written some more details in my Wiki page
> > >> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
> > >>
> > >> Best regards,
> > >> Ole
> > >>
> > >> On 9/7/22 01:51, Davide DelVento wrote:
> > >>> Thanks again to both of you.
> > >>>
> > >>> I actually did not build Slurm myself, otherwise I'

Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-08 Thread Davide DelVento
Thanks Ole, for this clarification, this is very good to know.

However, the problem is that the very example provided by slurm itself
is the one that has the error. I removed the unpack part with the
variable arguments and that fixed that part.

Unfortunately, the job_desc table always appears empty, so the whole
job_submit.lua seems like a moot point. Or is the example so outdated
(given that it cannot even log correctly) that this is now done in a
different way?
Davide

On Thu, Sep 8, 2022 at 12:23 AM Ole Holm Nielsen
 wrote:
>
> Hi Davide,
>
> In your slurmctld log you see an entry "error: job_submit/lua:
> /opt/slurm/job_submit.lua".
>
> What I think happens is that when slurmctld encounters an error in
> job_submit.lua, it will revert to the last known good script cached by
> slurmctld and ignore the file on disk from now on, even if it has been
> corrected.  An "scontrol reconfig" may make slurmctld reread the
> job_submit.lua, please try it.
>
> I believe that this slurmctld behavior is undocumented at present.  Please
> see https://bugs.schedmd.com/show_bug.cgi?id=14472#c15 for a description:
>
> > And, if after the reconfigure, the job_submit.lua is wrong formatted (or 
> > missing), it will use the previous version of the script (which we have 
> > stored backup previously):
>
> /Ole
>
>
> On 9/7/22 14:21, Davide DelVento wrote:
> > Thanks Ole, your wiki page sheds some light on this mystery.
> > Very frustrating that even the simple example provided in the release
> > fails, and it fails at the most basic logging functionality.
> >
> > Note that "my" job_submit.lua is now the unmodified, slurm-provided
> > one and that the luac command returns nothing in my case (this is
> > Lua 5.3.4) so syntax seems correct?
> >
> > Yet the logs report the problem I mentioned rather than the actual
> > content that the plugin is attempting to log.
> >
> > On Wed, Sep 7, 2022 at 2:13 AM Ole Holm Nielsen
> >  wrote:
> >>
> >> Hi Davide,
> >>
> >> I suggest that you check your job_submit.lua script with the LUA compiler:
> >>
> >> luac -p /etc/slurm/job_submit.lua
> >>
> >> I have written some more details in my Wiki page
> >> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
> >>
> >> Best regards,
> >> Ole
> >>
> >> On 9/7/22 01:51, Davide DelVento wrote:
> >>> Thanks again to both of you.
> >>>
> >>> I actually did not build Slurm myself, otherwise I'd keep extensive
> >>> logs of what I did. Other people did, so I don't know. However, I get
> >>> the same grep'ing results as yours.
> >>>
> >>> Looking at the logs reveals some info, but it's cryptic.
> >>>
> >>> [2022-09-06T17:33:56.513] debug3: job_submit/lua:
> >>> slurm_lua_loadscript: skipping loading Lua script:
> >>> /opt/slurm/job_submit.lua
> >>> [2022-09-06T17:33:56.513] error: job_submit/lua:
> >>> /opt/slurm/job_submit.lua: [string "slurm.user_msg
> >>> (string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
> >>> (no value)
> >>>
> >>> As you can see, there is no line number and there is nothing like
> >>> user_msg in this code. There is indeed an "unpack" which is used in
> >>> the SchedMD-defined logging helper function which has a comment
> >>> "Implicit definition of arg was removed in Lua 5.2" and that's where I
> >>> speculate the error occurs.
> >>>
> >>> I should stress, this is with their own example, not my code. I guess
> >>> I could forgo the logging and move forward, but that won't probably
> >>> lead me very far.
> >>>
> >>> I am contemplating submitting a github issue about it? I did check
> >>> that the version of the job_submit.lua I have is the same currently in
> >>> the repo at 
> >>> https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
> >>>
> >>> On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
> >>>  wrote:
> >>>>
> >>>> Did you install all prerequiste packages (including lua) on the server
> >>>> where you built the Slurm packages?
> >>>>
> >>>> On my system I get:
> >>>>
> >>>> $ strings `which slurmctld ` | grep HAVE_LUA
> >>>> HAVE_LUA 1
> >>>>
> >>>> /Ole
> >>>>
> >&

Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-07 Thread Davide DelVento
No, I never moved that file from the Linux cluster (and I do the
editing with Vim, which warns me about that possible issue).

On Wed, Sep 7, 2022 at 11:19 AM Brian Andrus  wrote:
>
> Possibly way off base, but did you happen to do any of the editing in
> Windows? Maybe running into the cr/lf issue for how windows saves text
> files?
>
> Brian Andrus
>
> On 9/7/2022 5:21 AM, Davide DelVento wrote:
> > Thanks Ole, your wiki page sheds some light on this mystery.
> > Very frustrating that even the simple example provided in the release
> > fails, and it fails at the most basic logging functionality.
> >
> > Note that "my" job_submit.lua is now the unmodified, slurm-provided
> > one and that the luac command returns nothing in my case (this is
> > Lua 5.3.4) so syntax seems correct?
> >
> > Yet the logs report the problem I mentioned rather than the actual
> > content that the plugin is attempting to log.
> >
> > On Wed, Sep 7, 2022 at 2:13 AM Ole Holm Nielsen
> >  wrote:
> >> Hi Davide,
> >>
> >> I suggest that you check your job_submit.lua script with the LUA compiler:
> >>
> >> luac -p /etc/slurm/job_submit.lua
> >>
> >> I have written some more details in my Wiki page
> >> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
> >>
> >> Best regards,
> >> Ole
> >>
> >> On 9/7/22 01:51, Davide DelVento wrote:
> >>> Thanks again to both of you.
> >>>
> >>> I actually did not build Slurm myself, otherwise I'd keep extensive
> >>> logs of what I did. Other people did, so I don't know. However, I get
> >>> the same grep'ing results as yours.
> >>>
> >>> Looking at the logs reveals some info, but it's cryptic.
> >>>
> >>> [2022-09-06T17:33:56.513] debug3: job_submit/lua:
> >>> slurm_lua_loadscript: skipping loading Lua script:
> >>> /opt/slurm/job_submit.lua
> >>> [2022-09-06T17:33:56.513] error: job_submit/lua:
> >>> /opt/slurm/job_submit.lua: [string "slurm.user_msg
> >>> (string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
> >>> (no value)
> >>>
> >>> As you can see, there is no line number and there is nothing like
> >>> user_msg in this code. There is indeed an "unpack" which is used in
> >>> the SchedMD-defined logging helper function which has a comment
> >>> "Implicit definition of arg was removed in Lua 5.2" and that's where I
> >>> speculate the error occurs.
> >>>
> >>> I should stress, this is with their own example, not my code. I guess
> >>> I could forgo the logging and move forward, but that won't probably
> >>> lead me very far.
> >>>
> >>> I am contemplating submitting a github issue about it? I did check
> >>> that the version of the job_submit.lua I have is the same currently in
> >>> the repo at 
> >>> https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
> >>>
> >>> On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
> >>>  wrote:
> >>>> Did you install all prerequiste packages (including lua) on the server
> >>>> where you built the Slurm packages?
> >>>>
> >>>> On my system I get:
> >>>>
> >>>> $ strings `which slurmctld ` | grep HAVE_LUA
> >>>> HAVE_LUA 1
> >>>>
> >>>> /Ole
> >>>>
> >>>> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
> >>>>
> >>>> On 9/2/22 05:15, Davide DelVento wrote:
> >>>>> Thanks.
> >>>>>
> >>>>> I did try a lua script as soon as I got your first email, but that
> >>>>> never worked (yes, I enabled it in slurm.conf and ran "scontrol
> >>>>> reconfigure" after). Slurm simply acted as if there was no job_submit 
> >>>>> script.
> >>>>>
> >>>>> After various tests, all unsuccessful, today I found that link which I
> >>>>> mentioned saying that lua might not be compiled in, hence all my most
> >>>>> recent messages of this thread.
> >>>>>
> >>>>> That file is indeed there, so that's good news that I don't need to 
> >>>>> recompile.
> >>>>> However I'm puzzled on what might b

Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-07 Thread Davide DelVento
Thanks Ole, your wiki page sheds some light on this mystery.
Very frustrating that even the simple example provided in the release
fails, and it fails at the most basic logging functionality.

Note that "my" job_submit.lua is now the unmodified, slurm-provided
one and that the luac command returns nothing in my case (this is
Lua 5.3.4) so syntax seems correct?

Yet the logs report the problem I mentioned rather than the actual
content that the plugin is attempting to log.

On Wed, Sep 7, 2022 at 2:13 AM Ole Holm Nielsen
 wrote:
>
> Hi Davide,
>
> I suggest that you check your job_submit.lua script with the LUA compiler:
>
> luac -p /etc/slurm/job_submit.lua
>
> I have written some more details in my Wiki page
> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
>
> Best regards,
> Ole
>
> On 9/7/22 01:51, Davide DelVento wrote:
> > Thanks again to both of you.
> >
> > I actually did not build Slurm myself, otherwise I'd keep extensive
> > logs of what I did. Other people did, so I don't know. However, I get
> > the same grep'ing results as yours.
> >
> > Looking at the logs reveals some info, but it's cryptic.
> >
> > [2022-09-06T17:33:56.513] debug3: job_submit/lua:
> > slurm_lua_loadscript: skipping loading Lua script:
> > /opt/slurm/job_submit.lua
> > [2022-09-06T17:33:56.513] error: job_submit/lua:
> > /opt/slurm/job_submit.lua: [string "slurm.user_msg
> > (string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
> > (no value)
> >
> > As you can see, there is no line number and there is nothing like
> > user_msg in this code. There is indeed an "unpack" which is used in
> > the SchedMD-defined logging helper function which has a comment
> > "Implicit definition of arg was removed in Lua 5.2" and that's where I
> > speculate the error occurs.
> >
> > I should stress, this is with their own example, not my code. I guess
> > I could forgo the logging and move forward, but that won't probably
> > lead me very far.
> >
> > I am contemplating submitting a github issue about it? I did check
> > that the version of the job_submit.lua I have is the same currently in
> > the repo at 
> > https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
> >
> > On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
> >  wrote:
> >>
> >> Did you install all prerequiste packages (including lua) on the server
> >> where you built the Slurm packages?
> >>
> >> On my system I get:
> >>
> >> $ strings `which slurmctld ` | grep HAVE_LUA
> >> HAVE_LUA 1
> >>
> >> /Ole
> >>
> >> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
> >>
> >> On 9/2/22 05:15, Davide DelVento wrote:
> >>> Thanks.
> >>>
> >>> I did try a lua script as soon as I got your first email, but that
> >>> never worked (yes, I enabled it in slurm.conf and ran "scontrol
> >>> reconfigure" after). Slurm simply acted as if there was no job_submit 
> >>> script.
> >>>
> >>> After various tests, all unsuccessful, today I found that link which I
> >>> mentioned saying that lua might not be compiled in, hence all my most
> >>> recent messages of this thread.
> >>>
> >>> That file is indeed there, so that's good news that I don't need to 
> >>> recompile.
> >>> However I'm puzzled on what might be missing...
> >>>
> >>>
> >>> On Thu, Sep 1, 2022 at 6:33 PM Brian Andrus  wrote:
> >>>>
> >>>> lua is the language you can use with the job_submit plugin.
> >>>>
> >>>> I was showing a quick way to see that job_submit capability is indeed in
> >>>> there.
> >>>>
> >>>> You can see if lua support is there by looking for the job_submit_lua.so
> >>>> file is there.
> >>>> It would be part of the slurm rpm (not the slurm-slurmctl rpm)
> >>>>
> >>>> Usually it would be found at /usr/lib64/slurm/job_submit_lua.so
> >>>>
> >>>> If that is there, you should be good with trying out a job_submit lua
> >>>> script.
> >>>>
> >>>> Brian Andrus
> >>>>
> >>>> On 9/1/2022 1:24 PM, Davide DelVento wrote:
> >>>>> Thanks again, Brian, indeed that grep returns many hits, but none of
> >>>>> them includes l

Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-06 Thread Davide DelVento
Thanks again to both of you.

I actually did not build Slurm myself, otherwise I would have kept
extensive logs of what I did. Other people built it, so I don't know.
However, I get the same grep results as yours.

Looking at the logs reveals some info, but it's cryptic.

[2022-09-06T17:33:56.513] debug3: job_submit/lua:
slurm_lua_loadscript: skipping loading Lua script:
/opt/slurm/job_submit.lua
[2022-09-06T17:33:56.513] error: job_submit/lua:
/opt/slurm/job_submit.lua: [string "slurm.user_msg
(string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
(no value)

As you can see, there is no line number and there is nothing like
user_msg in this code. There is indeed an "unpack" which is used in
the SchedMD-defined logging helper function which has a comment
"Implicit definition of arg was removed in Lua 5.2" and that's where I
speculate the error occurs.

I should stress that this is with their own example, not my code. I
guess I could forgo the logging and move forward, but that probably
won't lead me very far.

I am contemplating submitting a GitHub issue about it. I did check
that the version of job_submit.lua I have is the same as the one
currently in the repo at
https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
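
In case it helps anyone else hitting the same "bad argument #2 to
'format'" error, the workaround I am testing is to replace the helper
that formats via table.unpack({...}) with one that passes the varargs
straight through. This is only a sketch of my local change, not
anything blessed by SchedMD:

-- local replacement for the logging helper in the example script;
-- slurm.log_info() can take a format string plus varargs directly,
-- so there is no need to round-trip them through table.unpack({...})
local log_prefix = 'job_submit'

local function log_info(fmt, ...)
    slurm.log_info(log_prefix .. ": " .. fmt, ...)
end

-- usage, e.g. inside slurm_job_submit():
--   log_info("user=%s licenses=%s", job_desc.user_name,
--            tostring(job_desc.licenses))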

On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
 wrote:
>
> Did you install all prerequiste packages (including lua) on the server
> where you built the Slurm packages?
>
> On my system I get:
>
> $ strings `which slurmctld ` | grep HAVE_LUA
> HAVE_LUA 1
>
> /Ole
>
> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
>
> On 9/2/22 05:15, Davide DelVento wrote:
> > Thanks.
> >
> > I did try a lua script as soon as I got your first email, but that
> > never worked (yes, I enabled it in slurm.conf and ran "scontrol
> > reconfigure" after). Slurm simply acted as if there was no job_submit 
> > script.
> >
> > After various tests, all unsuccessful, today I found that link which I
> > mentioned saying that lua might not be compiled in, hence all my most
> > recent messages of this thread.
> >
> > That file is indeed there, so that's good news that I don't need to 
> > recompile.
> > However I'm puzzled on what might be missing...
> >
> >
> > On Thu, Sep 1, 2022 at 6:33 PM Brian Andrus  wrote:
> >>
> >> lua is the language you can use with the job_submit plugin.
> >>
> >> I was showing a quick way to see that job_submit capability is indeed in
> >> there.
> >>
> >> You can see if lua support is there by looking for the job_submit_lua.so
> >> file is there.
> >> It would be part of the slurm rpm (not the slurm-slurmctl rpm)
> >>
> >> Usually it would be found at /usr/lib64/slurm/job_submit_lua.so
> >>
> >> If that is there, you should be good with trying out a job_submit lua
> >> script.
> >>
> >> Brian Andrus
> >>
> >> On 9/1/2022 1:24 PM, Davide DelVento wrote:
> >>> Thanks again, Brian, indeed that grep returns many hits, but none of
> >>> them includes lua, i.e.
> >>>
> >>>strings `which slurmctld ` | grep -i job_submit | grep -i lua
> >>>
> >>> returns nothing. So I should use the C rather than the more convenient
> >>> lua interface, unless I recompile or am I missing something?
> >>>
> >>> On Thu, Sep 1, 2022 at 12:30 PM Brian Andrus  wrote:
> >>>> I would be surprised if it were compiled without the support. However,
> >>>> you could check and run something like:
> >>>>
> >>>> strings /sbin/slurmctld | grep job_submit
> >>>>
> >>>> (or where ever your slurmctld binary is). There should be quite a few
> >>>> lines with that in it.
> >>>>
> >>>> Brian Andrus
> >>>>
> >>>> On 9/1/2022 10:54 AM, Davide DelVento wrote:
> >>>>> Thanks Brian for the suggestion, which I am now exploring.
> >>>>>
> >>>>> The documentation is a bit cryptic for me, but exploring a few things
> >>>>> and checking 
> >>>>> https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
> >>>>> I suspect my slurm install (provided by cluster vendor) was not
> >>>>> compiled with the lua plugin installed. Do you know how to verify if
> >>>>> that is the case or if it's something else? I don't see a way to show
> >>>>> if the plugin is actually being "seen" by slurm, and I suspect i

Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-01 Thread Davide DelVento
Thanks.

I did try a lua script as soon as I got your first email, but that
never worked (yes, I enabled it in slurm.conf and ran "scontrol
reconfigure" after). Slurm simply acted as if there was no job_submit script.

After various tests, all unsuccessful, today I found the link I
mentioned, which says that Lua support might not be compiled in, hence
all my most recent messages in this thread.

That file is indeed there, so the good news is that I don't need to recompile.
However, I'm puzzled about what might be missing...


On Thu, Sep 1, 2022 at 6:33 PM Brian Andrus  wrote:
>
> lua is the language you can use with the job_submit plugin.
>
> I was showing a quick way to see that job_submit capability is indeed in
> there.
>
> You can see if lua support is there by looking for the job_submit_lua.so
> file is there.
> It would be part of the slurm rpm (not the slurm-slurmctl rpm)
>
> Usually it would be found at /usr/lib64/slurm/job_submit_lua.so
>
> If that is there, you should be good with trying out a job_submit lua
> script.
>
> Brian Andrus
>
> On 9/1/2022 1:24 PM, Davide DelVento wrote:
> > Thanks again, Brian, indeed that grep returns many hits, but none of
> > them includes lua, i.e.
> >
> >   strings `which slurmctld ` | grep -i job_submit | grep -i lua
> >
> > returns nothing. So I should use the C rather than the more convenient
> > lua interface, unless I recompile or am I missing something?
> >
> > On Thu, Sep 1, 2022 at 12:30 PM Brian Andrus  wrote:
> >> I would be surprised if it were compiled without the support. However,
> >> you could check and run something like:
> >>
> >> strings /sbin/slurmctld | grep job_submit
> >>
> >> (or where ever your slurmctld binary is). There should be quite a few
> >> lines with that in it.
> >>
> >> Brian Andrus
> >>
> >> On 9/1/2022 10:54 AM, Davide DelVento wrote:
> >>> Thanks Brian for the suggestion, which I am now exploring.
> >>>
> >>> The documentation is a bit cryptic for me, but exploring a few things
> >>> and checking 
> >>> https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
> >>> I suspect my slurm install (provided by cluster vendor) was not
> >>> compiled with the lua plugin installed. Do you know how to verify if
> >>> that is the case or if it's something else? I don't see a way to show
> >>> if the plugin is actually being "seen" by slurm, and I suspect it's
> >>> not.
> >>>
> >>> Does anyone else have other suggestions or comment on either the
> >>> plugin or the prolog workaround?
> >>>
> >>> Thanks!
> >>>
> >>>
> >>> On Tue, Aug 30, 2022 at 3:01 PM Brian Andrus  wrote:
> >>>> Not sure if you can do all the things you intend, but the job_submit
> >>>> script is precisely where you want to check submission options.
> >>>>
> >>>> https://slurm.schedmd.com/job_submit_plugins.html
> >>>>
> >>>> Brian Andrus
> >>>>
> >>>> On 8/30/2022 12:58 PM, Davide DelVento wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I would like to soft-enforce license utilization only when the -L is
> >>>>> set. My idea: check in the prolog if the license was requested and
> >>>>> only if it were, set the environmental variables needed for the
> >>>>> license.
> >>>>>
> >>>>> I looked at all environmental variables set by slurm and did not find
> >>>>> any related to the license as I was hoping.
> >>>>>
> >>>>> As a workaround, I could check
> >>>>>
> >>>>> scontrol show job $SLURM_JOB_ID | grep License
> >>>>>
> >>>>> and that would work, but (as discussed in other messages in this list)
> >>>>> the documentation at https://slurm.schedmd.com/prolog_epilog.html say
> >>>>>
> >>>>>> Prolog and Epilog scripts should be designed to be as short as possible
> >>>>>> and should not call Slurm commands (e.g. squeue, scontrol, sacctmgr,
> >>>>>> etc). [...] Slurm commands in these scripts can potentially lead to 
> >>>>>> performance
> >>>>>> issues and should not be used.
> >>>>> This is a bit of a concern, since the prolog would be invoked for
> >>>>> every job on the cluster, and it's a prolog (rather than the epilogue
> >>>>> like discussed in earlier messages).
> >>>>>
> >>>>> So two questions:
> >>>>>
> >>>>> 1) is there a better workaround to check in the prolog if the current
> >>>>> job requested a license and/or
> >>>>> 2) would this kind of use of scontrol be okay or is indeed a concern
> >>>>>
> >>>>> Thanks!
> >>>>>
>



Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-01 Thread Davide DelVento
Thanks again, Brian; indeed that grep returns many hits, but none of
them includes lua, i.e.

 strings `which slurmctld` | grep -i job_submit | grep -i lua

returns nothing. So should I use the C interface rather than the more
convenient Lua one, unless I recompile, or am I missing something?

On Thu, Sep 1, 2022 at 12:30 PM Brian Andrus  wrote:
>
> I would be surprised if it were compiled without the support. However,
> you could check and run something like:
>
> strings /sbin/slurmctld | grep job_submit
>
> (or where ever your slurmctld binary is). There should be quite a few
> lines with that in it.
>
> Brian Andrus
>
> On 9/1/2022 10:54 AM, Davide DelVento wrote:
> > Thanks Brian for the suggestion, which I am now exploring.
> >
> > The documentation is a bit cryptic for me, but exploring a few things
> > and checking 
> > https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
> > I suspect my slurm install (provided by cluster vendor) was not
> > compiled with the lua plugin installed. Do you know how to verify if
> > that is the case or if it's something else? I don't see a way to show
> > if the plugin is actually being "seen" by slurm, and I suspect it's
> > not.
> >
> > Does anyone else have other suggestions or comment on either the
> > plugin or the prolog workaround?
> >
> > Thanks!
> >
> >
> > On Tue, Aug 30, 2022 at 3:01 PM Brian Andrus  wrote:
> >> Not sure if you can do all the things you intend, but the job_submit
> >> script is precisely where you want to check submission options.
> >>
> >> https://slurm.schedmd.com/job_submit_plugins.html
> >>
> >> Brian Andrus
> >>
> >> On 8/30/2022 12:58 PM, Davide DelVento wrote:
> >>> Hi,
> >>>
> >>> I would like to soft-enforce license utilization only when the -L is
> >>> set. My idea: check in the prolog if the license was requested and
> >>> only if it were, set the environmental variables needed for the
> >>> license.
> >>>
> >>> I looked at all environmental variables set by slurm and did not find
> >>> any related to the license as I was hoping.
> >>>
> >>> As a workaround, I could check
> >>>
> >>> scontrol show job $SLURM_JOB_ID | grep License
> >>>
> >>> and that would work, but (as discussed in other messages in this list)
> >>> the documentation at https://slurm.schedmd.com/prolog_epilog.html say
> >>>
> >>>> Prolog and Epilog scripts should be designed to be as short as possible
> >>>> and should not call Slurm commands (e.g. squeue, scontrol, sacctmgr,
> >>>> etc). [...] Slurm commands in these scripts can potentially lead to 
> >>>> performance
> >>>> issues and should not be used.
> >>> This is a bit of a concern, since the prolog would be invoked for
> >>> every job on the cluster, and it's a prolog (rather than the epilogue
> >>> like discussed in earlier messages).
> >>>
> >>> So two questions:
> >>>
> >>> 1) is there a better workaround to check in the prolog if the current
> >>> job requested a license and/or
> >>> 2) would this kind of use of scontrol be okay or is indeed a concern
> >>>
> >>> Thanks!
> >>>
>



Re: [slurm-users] License management and invoking scontrol in the prolog

2022-09-01 Thread Davide DelVento
Thanks Brian for the suggestion, which I am now exploring.

The documentation is a bit cryptic for me, but exploring a few things
and checking 
https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
I suspect my Slurm install (provided by the cluster vendor) was not
compiled with the Lua plugin. Do you know how to verify whether that
is the case or whether it's something else? I don't see a way to show
if the plugin is actually being "seen" by slurm, and I suspect it's
not.

Does anyone else have other suggestions or comments on either the
plugin or the prolog workaround?

Thanks!


On Tue, Aug 30, 2022 at 3:01 PM Brian Andrus  wrote:
>
> Not sure if you can do all the things you intend, but the job_submit
> script is precisely where you want to check submission options.
>
> https://slurm.schedmd.com/job_submit_plugins.html
>
> Brian Andrus
>
> On 8/30/2022 12:58 PM, Davide DelVento wrote:
> > Hi,
> >
> > I would like to soft-enforce license utilization only when the -L is
> > set. My idea: check in the prolog if the license was requested and
> > only if it were, set the environmental variables needed for the
> > license.
> >
> > I looked at all environmental variables set by slurm and did not find
> > any related to the license as I was hoping.
> >
> > As a workaround, I could check
> >
> > scontrol show job $SLURM_JOB_ID | grep License
> >
> > and that would work, but (as discussed in other messages in this list)
> > the documentation at https://slurm.schedmd.com/prolog_epilog.html say
> >
> >> Prolog and Epilog scripts should be designed to be as short as possible
> >> and should not call Slurm commands (e.g. squeue, scontrol, sacctmgr,
> >> etc). [...] Slurm commands in these scripts can potentially lead to 
> >> performance
> >> issues and should not be used.
> > This is a bit of a concern, since the prolog would be invoked for
> > every job on the cluster, and it's a prolog (rather than the epilogue
> > like discussed in earlier messages).
> >
> > So two questions:
> >
> > 1) is there a better workaround to check in the prolog if the current
> > job requested a license and/or
> > 2) would this kind of use of scontrol be okay or is indeed a concern
> >
> > Thanks!
> >
>



[slurm-users] License management and invoking scontrol in the prolog

2022-08-30 Thread Davide DelVento
Hi,

I would like to soft-enforce license utilization only when -L is
set. My idea: check in the prolog whether the license was requested
and, only if it was, set the environment variables needed for the
license.

I looked at all the environment variables set by Slurm and did not
find any related to licenses, as I had hoped.

As a workaround, I could check

scontrol show job $SLURM_JOB_ID | grep License

and that would work, but (as discussed in other messages on this list)
the documentation at https://slurm.schedmd.com/prolog_epilog.html says:

> Prolog and Epilog scripts should be designed to be as short as possible
> and should not call Slurm commands (e.g. squeue, scontrol, sacctmgr,
> etc). [...] Slurm commands in these scripts can potentially lead to 
> performance
> issues and should not be used.

This is a bit of a concern, since the prolog would be invoked for
every job on the cluster, and it's a prolog (rather than an epilog, as
discussed in earlier messages).

So two questions:

1) Is there a better workaround to check in the prolog whether the
current job requested a license, and/or
2) would this kind of use of scontrol be okay, or is it indeed a
concern? (A sketch of what I have in mind is below.)
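
For concreteness, the workaround I have in mind is roughly the sketch
below. It assumes the "export NAME=value on stdout" mechanism of a
TaskProlog-style script, and "mylic", MYLIC_SERVER and the license
server address are made-up placeholders; whether the scontrol call is
acceptable here is exactly question 2:

#!/bin/bash
# Sketch: export the license environment only for jobs submitted with -L.
# Placeholder names throughout; not tested in production.

licenses=$(scontrol show job "$SLURM_JOB_ID" | grep -i License)

if [[ "$licenses" == *mylic* ]]; then
    # TaskProlog convention: an "export name=value" line on stdout is
    # added to the job's environment.
    echo "export MYLIC_SERVER=27000@licserver.example.org"
fi

exit 0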

Thanks!