[slurm-users] practical tips to budget cluster expansion for a research center with heterogeneous workloads?

2019-03-21 Thread Graziano D'Innocenzo
Dear Slurm users, my team manages an HPC cluster (running Slurm) for a research centre. We are planning to expand the cluster over the next couple of years and we are facing a problem. We would like to estimate how many resources each user will need on average (in terms of CPU cor

Re: [slurm-users] practical tips to budget cluster expansion for a research center with heterogeneous workloads?

2019-03-21 Thread Alex Chekholko
Hey Graziano, To make your decision more "data-driven", you can pipe your SLURM accounting logs into a tool like XDMOD which will make you pie charts of usage by user, group, job, gres, etc. https://open.xdmod.org/8.0/index.html You may also consider assigning this task to one of your "machine
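Before a full XDMoD deployment is in place, a rough per-user picture can be pulled straight from the Slurm accounting database. The following is a minimal sketch, not part of the original thread: it assumes `sacct` is on the PATH and accounting storage is enabled, and the function names `fetch_sacct_rows` and `cpu_hours_by_user` are hypothetical helpers introduced here for illustration. It sums core-hours (allocated CPUs times elapsed seconds) per user.

```python
import subprocess
from collections import defaultdict

# Standard sacct format fields; ElapsedRaw is the job's elapsed time in seconds.
FIELDS = "User,Account,AllocCPUS,ElapsedRaw"


def fetch_sacct_rows(start, end):
    """Hypothetical helper: return pipe-delimited sacct rows for all users.

    Assumes sacct is installed and the caller may read other users' jobs.
    """
    cmd = [
        "sacct", "--allusers", "--allocations", "--parsable2", "--noheader",
        "--starttime", start, "--endtime", end, "--format", FIELDS,
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.splitlines()


def cpu_hours_by_user(lines):
    """Sum core-hours per user from rows shaped like 'user|account|cpus|seconds'."""
    totals = defaultdict(float)
    for line in lines:
        parts = line.strip().split("|")
        if len(parts) != 4:
            continue  # skip blank or malformed rows
        user, _account, cpus, elapsed = parts
        if not user or not cpus.isdigit() or not elapsed.isdigit():
            continue  # skip rows with missing or non-numeric fields
        totals[user] += int(cpus) * int(elapsed) / 3600.0
    return dict(totals)
```

Calling `cpu_hours_by_user(fetch_sacct_rows("2019-01-01", "2019-03-01"))` would give a dictionary of core-hours keyed by user name. Note that this counts each job's full elapsed time even if it only partly overlaps the window, and it ignores memory and GRES, so treat the totals as a first approximation rather than a substitute for XDMoD's reports.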

Re: [slurm-users] practical tips to budget cluster expansion for a research center with heterogeneous workloads?

2019-03-21 Thread Noam Bernstein
> On Mar 21, 2019, at 12:38 PM, Alex Chekholko wrote: > > Hey Graziano, > > To make your decision more "data-driven", you can pipe your SLURM accounting > logs into a tool like XDMOD which will make you pie charts of usage by user, > group, job, gres, etc. > > https://open.xdmod.org/8.0/inde

Re: [slurm-users] practical tips to budget cluster expansion for a research center with heterogeneous workloads?

2019-03-21 Thread Alex Chekholko
Hi Noam, Right, xdmod is a standard LAMP stack webapp. You can see some pictures of the graphs in the web interface in a google image search here https://www.google.com/search?q=xdmod&source=lnms&tbm=isch&sa=X It may also require a fairly beefy database backend, depending on how many millions of

Re: [slurm-users] practical tips to budget cluster expansion for a research center with heterogeneous workloads?

2019-03-26 Thread Graziano D'Innocenzo
Dear all, thanks everybody for your comments. XDMOD definitely looks like the way to go. We will look into getting it deployed. Thanks again for your help, -- Graziano D'Innocenzo (PGP key: 9213BE46) Systems Administrator - ADAPT Centre On Thu, Mar 21, 2019 at 5:16 PM Alex Chekholko wrote: > > Hi Noa