Yep! I've done everything I can think of (running scontrol reconfigure, restarting all the relevant daemons), but I still can't get it to work.

On Sun, Feb 12, 2017 at 4:38 PM Lachlan Musicman <data...@gmail.com> wrote:

> On 12 February 2017 at 16:06, Travis DePrato <trav...@umich.edu> wrote:
>
> > I've tried multiple variations of the SelectTypeParameters option (before
> > sending this mail) to no success.
> >
> > Currently it's http://pastebin.com/ATcsvvtQ with
> > SelectTypeParameters=CR_CPU_Memory
> >
> > I'm running 10 jobs, each single threaded/processed/etc., just sitting on
> > "sleep 1000", but I can never get more than 8 to run at a time, and I
> > still can't specify memory other than 1.
>
> I always ask the stupid questions: you are changing the conf, distributing
> that change to all nodes, restarting slurmctld then running scontrol
> reconfigure?
>
> cheers
> L.
>
> ------
> The most dangerous phrase in the language is, "We've always done it this
> way."
>
> - Grace Hopper
>
> > On Sat, Feb 11, 2017 at 3:26 AM Lachlan Musicman <data...@gmail.com> wrote:
> >
> > > 1. As EV noted, to get Memory as a consumable resource, you will need to
> > > add it to the line that says CR_CPU - change to CR_CPU_Memory
> > >
> > > https://slurm.schedmd.com/slurm.conf.html
> > >
> > > 2. That's because of the CR_CPU combined with cons_res. Change to CR_CORE
> > > for per core or CR_SOCKET for per socket. For definitions of each,
> > > there's a hardware page:
> > >
> > > https://slurm.schedmd.com/cons_res.html
> > >
> > > but for the cpu/core/socket definition, I found the image at the top of
> > > this page very helpful
> > >
> > > https://slurm.schedmd.com/mc_support.html
> > >
> > > L.
> > >
> > > On 11 February 2017 at 07:31, E V <eliven...@gmail.com> wrote:
> > >
> > > > man slurm.conf and search for cons_res, you need to make a change from
> > > > the defaults. Don't remember the details ATM, but that should get you
> > > > started.
> > > >
> > > > On Fri, Feb 10, 2017 at 2:42 PM, Travis DePrato <trav...@umich.edu> wrote:
> > > >
> > > > > For reference, slurm.conf: http://pastebin.com/XT6TvQhh
> > > > >
> > > > > I've been tasked with setting up a small cluster for a research group
> > > > > where I work, despite knowing relatively little about HPC or clusters
> > > > > in general. I've installed slurm on the eight compute nodes and the
> > > > > login node, but, I'm having two issues currently:
> > > > >
> > > > > 1. I cannot specify a memory requirement other than --mem=1
> > > > > Sample submission output with --mem=2: http://pastebin.com/5PY9N6n4
> > > > >
> > > > > 2. I cannot get nodes to execute more than one job at a time. The 9th
> > > > > job is always queued with reason Resources. I think this is related
> > > > > to the lines
> > > > >
> > > > > scontrol: Consumable Resources (CR) Node Selection plugin loaded with argument 17
> > > > > scontrol: Serial Job Resource Selection plugin loaded with argument 17
> > > > > scontrol: Linear node selection plugin loaded with argument 17
> > > > >
> > > > > because it seems like slurm is only allocating whole nodes at a time.
> > > > >
> > > > > Sorry if this is basic setup, but I've tried googling to no end.
> > > > > --
> > > > > Travis DePrato
> > > > > Computer Science & Engineering
> > > > > Math and Music Minors
> > > > > Student at University of Michigan
> > > > > Computer Consultant at EECS DCO

--
Travis DePrato
Computer Science & Engineering
Math and Music Minors
Student at University of Michigan
Computer Consultant at EECS DCO
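
For anyone reading this thread later, here is a rough sketch of what the advice above amounts to in slurm.conf. It is not taken from either pastebin: the node names, partition name, CPU counts, and memory sizes are made up for illustration and need to be replaced with the real hardware values. It also assumes that the node lines in the original config did not declare CPUs or RealMemory. As far as I know, a node without those values is treated as having 1 CPU and 1 MB of RealMemory, which would be consistent with both symptoms (one job per node, and nothing above --mem=1), but that is a guess about the unseen config rather than a confirmed diagnosis.

```
# Sketch only. Node names, partition name, CPUs and RealMemory are
# hypothetical placeholders, not values from the original cluster.

# Allocate individual CPUs and memory rather than whole nodes.
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory

# Declare what each node actually has (RealMemory is in megabytes).
NodeName=node[01-08] CPUs=8 RealMemory=16000 State=UNKNOWN
PartitionName=batch Nodes=node[01-08] Default=YES MaxTime=INFINITE State=UP
```

Running "slurmd -C" on a compute node prints a NodeName line with the hardware values that node detects, which can serve as a starting point for the NodeName entry. After copying the edited file to every node and restarting slurmctld and the slurmd daemons, "scontrol show node" should report the new CPU and memory figures if the change took effect.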