[slurm-users] How to find the average downtime of compute nodes in a cluster?
By downtime I mean any node in the *down* state (not drng, drain, IDLE, or ALLOC).

--
Thanks & Regards,
Sudeep Narayan Banerjee
System Analyst | Scientist B
Information System Technology Facility
Academic Block 5 | Room 110
Indian Institute of Technology Gandhinagar
Palaj, Gujarat 382355 INDIA
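One way to turn the question above into a number is sketched here. It assumes you already have a list of node-down events with start and end times (for example exported from the accounting database with something like `sacctmgr show event`; the export step and its format are assumptions, not shown here). The node names, timestamps, and helper names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample of "down" events as (node, start, end).
# How these records are exported from Slurm is an assumption.
EVENTS = [
    ("node01", "2020-03-01T10:00:00", "2020-03-01T14:00:00"),
    ("node01", "2020-03-10T00:00:00", "2020-03-10T02:00:00"),
    ("node02", "2020-03-05T08:00:00", "2020-03-05T20:00:00"),
]


def downtime_hours_per_node(events):
    """Sum the duration of all down events for each node, in hours."""
    totals = defaultdict(float)
    for node, start, end in events:
        delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        totals[node] += delta.total_seconds() / 3600.0
    return dict(totals)


def average_downtime_hours(events, node_count):
    """Average downtime per node over the whole cluster.

    node_count is the total number of compute nodes, including
    nodes that were never down during the period.
    """
    per_node = downtime_hours_per_node(events)
    return sum(per_node.values()) / node_count


per_node = downtime_hours_per_node(EVENTS)
avg = average_downtime_hours(EVENTS, node_count=3)
print(per_node)
print(avg)
```

Dividing by the total node count rather than by the number of nodes that went down is a choice; use whichever matches the average you want to report.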
Re: [slurm-users] How to get the Average number of CPU cores used by jobs per day?
Dear Steven:

Yes, but I am unable to get the desired data. I am not sure which flags to use.

Thanks & Regards,
Sudeep Narayan Banerjee

On 03/04/20 10:42 am, Steven Dick wrote:
> Have you looked at sreport?
>
> On Fri, Apr 3, 2020 at 1:09 AM Sudeep Narayan Banerjee wrote:
>> How to get the average number of CPU cores used by jobs per day by a particular group?
>>
>> By group I mean, say, faculty group1, group2, etc.; each of those groups has a certain number of students.
Re: [slurm-users] Executing slurm command from Lua job_submit script?
Hi Chansup,

could you provide a code snippet?

Best
Marcus

On 02.04.2020 at 19:43, CB wrote:
> Hi,
>
> I'm running Slurm 19.05.
>
> I'm trying to execute some Slurm commands from the Lua job_submit script under a certain condition, but I found that they are not executed and return nothing.
>
> For example, I tried to execute a "sinfo" command from an external shell script, but it didn't work.
>
> Does Slurm prohibit executing any Slurm command from the Lua job_submit script?
>
> Thanks,
> - Chansup
Re: [slurm-users] How to get the Average number of CPU cores used by jobs per day?
Have you looked at sreport?

On Fri, Apr 3, 2020 at 1:09 AM Sudeep Narayan Banerjee wrote:
> How to get the average number of CPU cores used by jobs per day by a particular group?
>
> By group I mean, say, faculty group1, group2, etc.; each of those groups has a certain number of students.
>
> --
> Thanks & Regards,
> Sudeep Narayan Banerjee
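As a rough sketch of how sreport output could answer the question: if each faculty group maps to a Slurm account (an assumption), a report such as `sreport cluster AccountUtilizationByUser start=... end=... -t Hours` gives CPU-hours used per account, and dividing by the hours in the period gives the average number of cores in use. The account names and CPU-hour figures below are made up for illustration, and the exact sreport invocation should be checked against your Slurm version.

```python
def average_cores(cpu_hours, days):
    """Average number of cores in use over a period.

    cpu_hours: total CPU-hours consumed by the group in the period.
    days: length of the reporting period in days.
    """
    return cpu_hours / (24.0 * days)


# Hypothetical CPU-hours per account for a 30-day period,
# e.g. copied by hand from an sreport table.
USAGE = {"group1": 14400.0, "group2": 3600.0}
DAYS = 30

avg = {account: average_cores(hours, DAYS) for account, hours in USAGE.items()}
for account, cores in avg.items():
    print(f"{account}: {cores:.1f} cores on average")
```

With these made-up numbers, group1 averages 20 cores and group2 averages 5 cores over the 30 days.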
[slurm-users] How to get the Average number of CPU cores used by jobs per day?
How to get the average number of CPU cores used by jobs per day by a particular group?

By group I mean, say, faculty group1, group2, etc.; each of those groups has a certain number of students.

--
Thanks & Regards,
Sudeep Narayan Banerjee
[slurm-users] Executing slurm command from Lua job_submit script?
Hi,

I'm running Slurm 19.05.

I'm trying to execute some Slurm commands from the Lua job_submit script under a certain condition, but I found that they are not executed and return nothing.

For example, I tried to execute a "sinfo" command from an external shell script, but it didn't work.

Does Slurm prohibit executing any Slurm command from the Lua job_submit script?

Thanks,
- Chansup
[slurm-users] Interaction between suspension/preemption and core affinity
Hello everyone,

I have set up Slurm (19.05) to use suspension so that jobs in a higher-priority queue can preempt jobs in a lower-priority queue. However, jobs aren't resuming as I would expect. If a higher-priority job completes and frees up resources for the lower-priority jobs, often the suspended jobs don't resume, and another pending job in the lower queue gets launched instead. Furthermore, even when resources are actually available, the suspended jobs do not resume.

My best guess is that this is an interaction with task/core affinity. I use the cgroup task plugin to ensure that a job cannot use more cores than it requests. However, I suspect that the suspended job, when launched, was bound to e.g. core 1. In the above scenarios, presumably the job that preempted the suspended job was then bound to core 1. When e.g. core 2 becomes available, the suspended job cannot be placed on that core, as it is already bound to core 1, and thus core 2 remains idle even though there is a job waiting.

For my applications, I do care about constraining the total number of cores a job uses to what it requested (to make sure a job doesn't accidentally consume more resources than requested and thus affect other jobs on the node), but I don't care which cores it runs on. (These are typically single-thread, single-core jobs, so they don't need particular core placements.)

I was wondering:

a) Is my interpretation of what is happening in scheduling correct?
c) Can task/core affinity be reset on suspend/resume, so that a suspended job can resume on any of the available cores on the same node? Can one use cgroups to constrain cores without core affinity?

Thank you,
Kai Krueger
Re: [slurm-users] How many users are running jobs per day on average in slurm ?
I would recommend setting up XDMoD, as it will calculate this plus a variety of other useful statistics:

https://open.xdmod.org/8.5/index.html

Also, if you like Grafana, you can use this:

https://github.com/fasrc/slurm-diamond-collector

-Paul Edmon-

On 4/2/2020 8:31 AM, Sudeep Narayan Banerjee wrote:
> Dear Peter:
>
> I am trying *sacct* with multiple flags, but I am not getting the desired output for the query.
>
> Thanks & Regards,
> Sudeep Narayan Banerjee
>
> On 02/04/20 5:23 pm, Peter Kjellström wrote:
>> On Thu, 2 Apr 2020 16:57:46 +0530 Sudeep Narayan Banerjee wrote:
>>> any help in getting the right flags ?
>>
>> You may need to clarify that question a bit...
>>
>> How many users ran jobs on each day? (weekly, monthly average?)
>>
>> How many jobs per day did each user run? (weekly, monthly average?)
>>
>> And what counts as job activity for a day? Started a job that day? Completed a job? Had at least one job running?
>>
>> /Peter
Re: [slurm-users] How many users are running jobs per day on average in slurm ?
Dear Peter:

I am trying *sacct* with multiple flags, but I am not getting the desired output for the query.

Thanks & Regards,
Sudeep Narayan Banerjee

On 02/04/20 5:23 pm, Peter Kjellström wrote:
> On Thu, 2 Apr 2020 16:57:46 +0530 Sudeep Narayan Banerjee wrote:
>> any help in getting the right flags ?
>
> You may need to clarify that question a bit...
>
> How many users ran jobs on each day? (weekly, monthly average?)
>
> How many jobs per day did each user run? (weekly, monthly average?)
>
> And what counts as job activity for a day? Started a job that day? Completed a job? Had at least one job running?
>
> /Peter
Re: [slurm-users] How many users are running jobs per day on average in slurm ?
On 02-04-2020 14:16, Sudeep Narayan Banerjee wrote:
> Well, I am looking for: how many users ran jobs on each day on average (daily average), with at least one job running?

Exactly! The NEXT_JOB_ID increases by 1 for every job submitted. You may simply read and store this number every day at 00:00 and put the numbers into an Excel spreadsheet.

As I said, you should look into proper Slurm accounting so that you can get answers to relevant questions.

/Ole

> On 02/04/20 5:34 pm, Ole Holm Nielsen wrote:
>> On 02-04-2020 13:27, Sudeep Narayan Banerjee wrote:
>>> any help in getting the right flags ?
>>
>> The question is not well-defined. If you just want to know the JobID number in the cluster, you could run this command every day and watch the NEXT_JOB_ID increase:
>>
>>     # scontrol show config | grep NEXT_JOB_ID
>>     NEXT_JOB_ID = 2377393
>>
>> Job accounting is probably what you are looking for. You may take a look at my Slurm Wiki page https://wiki.fysik.dtu.dk/niflheim/Slurm_accounting
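The daily NEXT_JOB_ID reading described above could be sketched as follows, in place of the spreadsheet. This is only an illustration: the helper names are hypothetical, the sample readings after the first are invented, and the result counts jobs submitted per day, not users.

```python
import re


def parse_next_job_id(config_text):
    """Extract NEXT_JOB_ID from the text printed by `scontrol show config`."""
    match = re.search(r"^\s*NEXT_JOB_ID\s*=\s*(\d+)", config_text, re.MULTILINE)
    if match is None:
        raise ValueError("NEXT_JOB_ID not found in config output")
    return int(match.group(1))


def jobs_per_day(readings):
    """Differences between consecutive daily readings taken at 00:00."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


# Sample config text; only the NEXT_JOB_ID line is taken from the thread.
sample = "MpiDefault = none\nNEXT_JOB_ID             = 2377393\n"
first_reading = parse_next_job_id(sample)

# Invented readings for four consecutive midnights.
readings = [first_reading, 2377900, 2378150, 2378900]
daily = jobs_per_day(readings)
average = sum(daily) / len(daily)
print(daily)
print(round(average, 1))
```

In practice the readings would be appended to a file by a daily cron job; that part is left out here.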
Re: [slurm-users] How many users are running jobs per day on average in slurm ?
Well, I am looking for: how many users ran jobs on each day on average (daily average), with at least one job running?

Thanks & Regards,
Sudeep Narayan Banerjee

On 02/04/20 5:34 pm, Ole Holm Nielsen wrote:
> On 02-04-2020 13:27, Sudeep Narayan Banerjee wrote:
>> any help in getting the right flags ?
>
> The question is not well-defined. If you just want to know the JobID number in the cluster, you could run this command every day and watch the NEXT_JOB_ID increase:
>
>     # scontrol show config | grep NEXT_JOB_ID
>     NEXT_JOB_ID = 2377393
>
> Job accounting is probably what you are looking for. You may take a look at my Slurm Wiki page https://wiki.fysik.dtu.dk/niflheim/Slurm_accounting
>
> /Ole
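Under the definition given above (a user counts for a day if at least one of their jobs was running at some point during that day), the daily count and its average could be computed as in this sketch. It assumes job records with user, start, and end times are already available, for example from something like `sacct -a -X -S <start> -E <end> --format=User,Start,End -P -n`; that invocation is an assumption to verify locally, and the users and times below are invented.

```python
from datetime import date, datetime, timedelta

# Invented job records as (user, start, end).
JOBS = [
    ("alice", "2020-03-01T09:00:00", "2020-03-02T11:00:00"),
    ("bob", "2020-03-01T23:00:00", "2020-03-01T23:30:00"),
    ("alice", "2020-03-03T08:00:00", "2020-03-03T09:00:00"),
    ("carol", "2020-03-02T00:30:00", "2020-03-02T05:00:00"),
]


def users_per_day(jobs, first_day, last_day):
    """Count distinct users with a job running at any time on each day."""
    counts = {}
    day = first_day
    while day <= last_day:
        day_start = datetime.combine(day, datetime.min.time())
        day_end = day_start + timedelta(days=1)
        # A job overlaps the day if it starts before the day ends
        # and ends after the day starts.
        active = {
            user
            for user, start, end in jobs
            if datetime.fromisoformat(start) < day_end
            and datetime.fromisoformat(end) > day_start
        }
        counts[day] = len(active)
        day += timedelta(days=1)
    return counts


counts = users_per_day(JOBS, date(2020, 3, 1), date(2020, 3, 3))
average_users = sum(counts.values()) / len(counts)
for day, n in counts.items():
    print(day.isoformat(), n)
print(round(average_users, 2))
```

Jobs that are still running have no end time in real accounting data; they would need to be given the end of the reporting period before being passed to this function.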
Re: [slurm-users] How many users are running jobs per day on average in slurm ?
On 02-04-2020 13:27, Sudeep Narayan Banerjee wrote:
> any help in getting the right flags ?

The question is not well-defined. If you just want to know the JobID number in the cluster, you could run this command every day and watch the NEXT_JOB_ID increase:

    # scontrol show config | grep NEXT_JOB_ID
    NEXT_JOB_ID = 2377393

Job accounting is probably what you are looking for. You may take a look at my Slurm Wiki page https://wiki.fysik.dtu.dk/niflheim/Slurm_accounting

/Ole
Re: [slurm-users] How many users are running jobs per day on average in slurm ?
On Thu, 2 Apr 2020 16:57:46 +0530 Sudeep Narayan Banerjee wrote:
> any help in getting the right flags ?

You may need to clarify that question a bit...

How many users ran jobs on each day? (weekly, monthly average?)

How many jobs per day did each user run? (weekly, monthly average?)

And what counts as job activity for a day? Started a job that day? Completed a job? Had at least one job running?

/Peter
[slurm-users] How many users are running jobs per day on average in slurm ?
Any help in getting the right flags?

--
Thanks & Regards,
Sudeep Narayan Banerjee