Re: [slurm-users] job_submit_lua improvement

2021-05-10 Thread Luke Yeager
Contributions are usually handled through Bugzilla. Here's how I submitted a patch for job_submit/lua recently: https://bugs.schedmd.com/show_bug.cgi?id=10737 -----Original Message----- From: slurm-users On Behalf Of AB Sent: Monday, May 10, 2021 2:04 PM To: slurm-users@lists.schedmd.com Subjec

Re: [slurm-users] Fairshare +FairTree Algorithm + TRESBillingWeights

2021-04-05 Thread Luke Yeager
* Rawshare is only a representation of weight, where a higher value equals higher priority? * The total of rawshare need not be 100, since it is not a percentage? Look at the output of this command on your cluster and things will probably become clearer: sshare -a --format=cluster,
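
As a rough illustration of why raw shares need not sum to 100, here is a minimal sketch that assumes sibling accounts are normalized by dividing each raw share by the sum over all siblings. The function name and the account names are hypothetical and not part of Slurm; the actual Fair Tree calculation also takes usage into account.

```python
def normalized_shares(raw_shares):
    """Normalize raw shares among sibling accounts (illustrative only).

    raw_shares maps a hypothetical account name to its raw share value.
    Assumption: each sibling's normalized share is raw / sum(all siblings).
    """
    total = sum(raw_shares.values())
    if total <= 0:
        raise ValueError("raw shares must sum to a positive number")
    return {name: raw / total for name, raw in raw_shares.items()}


# Only the ratio between siblings matters, not the absolute numbers.
print(normalized_shares({"physics": 10, "chemistry": 30}))
print(normalized_shares({"physics": 1, "chemistry": 3}))
```

Under this assumption both calls give the same split, so scaling every sibling's raw share by the same factor changes nothing.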

Re: [slurm-users] Slurm - UnkillableStepProgram

2021-03-23 Thread Luke Yeager
While you're looking at this, make sure you don't set UnkillableStepTimeout to a value larger than 126 seconds: https://bugs.schedmd.com/show_bug.cgi?id=11103 From: slurm-users On Behalf Of Yap, Mike Sent: Monday, March 22, 2021 7:13 PM To: slurm-users@lists.schedmd.com Subject: [slurm-users] Sl
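
A quick way to check a configuration against the 126-second figure mentioned above might look like the following sketch. The helper names are hypothetical, and the parsing is deliberately simplistic: it assumes one `Key=Value` setting per match, `#` comments, and that the last occurrence wins.

```python
import re

LIMIT_SECONDS = 126  # threshold quoted in the message above


def unkillable_step_timeout(conf_text):
    """Return the UnkillableStepTimeout value in seconds, or None if unset."""
    value = None
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore anything after a comment marker
        match = re.search(r"\bUnkillableStepTimeout\s*=\s*(\d+)", line, re.IGNORECASE)
        if match:
            value = int(match.group(1))  # assumption: last occurrence wins
    return value


def exceeds_limit(conf_text, limit=LIMIT_SECONDS):
    """True if the configured timeout is set and larger than the limit."""
    value = unkillable_step_timeout(conf_text)
    return value is not None and value > limit


sample = "SlurmctldPort=6817\nUnkillableStepTimeout=300\n"
if exceeds_limit(sample):
    print("UnkillableStepTimeout is larger than", LIMIT_SECONDS, "seconds")
```

In practice you would pass in the contents of your own slurm.conf instead of the sample string.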

Re: [slurm-users] Set Fairshare by Hand

2021-03-22 Thread Luke Yeager
I asked something similar a few months ago and wasn't able to find anything to suit my needs. https://groups.google.com/g/slurm-users/c/ude1M5w_4IU/m/R2GziD9JAQAJ Good luck! Luke -----Original Message----- From: slurm-users On Behalf Of Paul Edmon Sent: Monday, March 22, 2021 6:31 AM To: slurm

Re: [slurm-users] Unable to get output file once job is completed

2021-02-08 Thread Luke Yeager
The output file is written to the filesystem mounted on the compute node[s], not the control node. Do you have a shared filesystem? Is the output file for your job being written to that shared filesystem? From: slurm-users On Behalf Of Zainul Abiddin Sent: Sunday, February 7, 2021 10:59 PM To:

Re: [slurm-users] Questions about sacctmgr load command

2021-01-11 Thread Luke Yeager
I looked into sacctmgr save/load and remember being very disappointed (unfortunately I can’t remember the details off the top of my head). Ole Nielsen’s open-source tool might be a good thing to take a look at: https://github.com/OleHolmNielsen/Slurm_tools/tree/master/slurmaccounts At our site,

Re: [slurm-users] Setting up slurmrestd

2021-01-08 Thread Luke Yeager
There's this: https://slurm.schedmd.com/rest.html From my personal notes: • Install these packages before re-compiling Slurm (Ubuntu 20.04): libhttp-parser-dev, libjwt-dev, and libyaml-dev • Set up JWT using these instructions: https://slurm.schedmd.com/jwt.html • Create /etc/slu

Re: [slurm-users] slurmctld daemon error

2020-12-14 Thread Luke Yeager
What does your ‘slurmctld.service’ look like? You might want to add something to the ‘After=’ section if your service is starting too quickly. e.g. we use ‘After=network.target munge.service’ (see here

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Luke Yeager
, right? When I tried building from source it didn't do that for me (even as root). Not sure if intended or if I was missing something. Thanks -ave On Thu, Dec 10, 2020, 1:11 PM Luke Yeager <lyea...@nvidia.com> wrote: Hi Avery, * pmix: we just use the standard Ubuntu packa

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Luke Yeager
Hi Avery, * pmix: we just use the standard Ubuntu packages on 20.04. Unfortunately the standard packages on 18.04 are too out of date for us. * openmpi: we build our own, using ./configure --with-pmix=internal … * slurm: we build our own, using ./configure --with-pmix=PATH … (see he

[slurm-users] How to assign temporary priority bonuses or penalties?

2020-12-10 Thread Luke Yeager
(originally posted at https://bugs.schedmd.com/show_bug.cgi?id=10322) There are some great tools for assigning discounts or penalties to jobs before they are allocated resources (QOS.UsageFactor, Partition.TRESBillingWeights, etc.). But what if I want to change the cost of a job after the fact?