I can't specify memory at all (if I try to give the --mem=2 flag, I get an error message that the requested node configuration isn't available), and the default is 1000MB (well below the amount of RAM on the machines).
On Sun, Feb 12, 2017 at 5:06 PM Carlos Fenoy <mini...@gmail.com> wrote:

> Are you specifying a memory limit for your jobs? You haven't set a default
> limit per cpu and slurm will allocate all the memory of a node if nothing
> else is specified.
>
> Regards,
> Carlos Fenoy
>
> On Sun, 12 Feb 2017, 22:54 Travis DePrato, <trav...@umich.edu> wrote:
>
> Yep! Doing everything I can think of, running scontrol reconfigure,
> restarting all the relevant daemons, can't seem to get it to work.
>
> On Sun, Feb 12, 2017 at 4:38 PM Lachlan Musicman <data...@gmail.com> wrote:
>
> On 12 February 2017 at 16:06, Travis DePrato <trav...@umich.edu> wrote:
>
> I've tried multiple variations of the SelectTypeParameters option (before
> sending this mail) to no success.
>
> Currently it's http://pastebin.com/ATcsvvtQ with
> SelectTypeParameters=CR_CPU_Memory
>
> I'm running 10 jobs, each single threaded/processed/etc., just sitting on
> "sleep 1000", but I can never get more than 8 to run at a time, and I still
> can't request memory other than 1.
>
> I always ask the stupid questions: you are changing the conf, distributing
> that change to all nodes, restarting slurmctld then running scontrol
> reconfigure?
>
> cheers
> L.
>
> ------
> The most dangerous phrase in the language is, "We've always done it this
> way."
>
> - Grace Hopper
>
> On Sat, Feb 11, 2017 at 3:26 AM Lachlan Musicman <data...@gmail.com> wrote:
>
> 1. As EV noted, to get Memory as a consumable resource, you will need to
> add it to the line that says CR_CPU - change to CR_CPU_Memory
> https://slurm.schedmd.com/slurm.conf.html
>
> 2. That's because of the CR_CPU combined with cons_res. Change to CR_CORE
> for per core or CR_SOCKET for per socket. For definitions of each, there's
> a hardware page:
>
> https://slurm.schedmd.com/cons_res.html
>
> but for the cpu/core/socket definition, I found the image at the top of
> this page very helpful
>
> https://slurm.schedmd.com/mc_support.html
>
> L.
>
> ------
> The most dangerous phrase in the language is, "We've always done it this
> way."
>
> - Grace Hopper
>
> On 11 February 2017 at 07:31, E V <eliven...@gmail.com> wrote:
>
> man slurm.conf and search for cons_res, you need to make a change from
> the defaults. Don't remember the details ATM, but that should get you
> started.
>
> On Fri, Feb 10, 2017 at 2:42 PM, Travis DePrato <trav...@umich.edu> wrote:
> > For reference, slurm.conf: http://pastebin.com/XT6TvQhh
> >
> > I've been tasked with setting up a small cluster for a research group where
> > I work, despite knowing relatively little about HPC or clusters in general.
> > I've installed slurm on the eight compute nodes and the login node, but I'm
> > having two issues currently:
> >
> > 1. I cannot specify a memory requirement other than --mem=1
> > Sample submission output with --mem=2: http://pastebin.com/5PY9N6n4
> >
> > 2. I cannot get nodes to execute more than one job at a time. The 9th job is
> > always queued with reason Resources. I think this is related to the lines
> >
> > scontrol: Consumable Resources (CR) Node Selection plugin loaded with argument 17
> > scontrol: Serial Job Resource Selection plugin loaded with argument 17
> > scontrol: Linear node selection plugin loaded with argument 17
> >
> > because it seems like slurm is only allocating whole nodes at a time.
> >
> > Sorry if this is basic setup, but I've tried googling to no end.
> >
> > --
> > Travis DePrato
> > Computer Science & Engineering
> > Math and Music Minors
> > Student at University of Michigan
> > Computer Consultant at EECS DCO
>
> --
> Travis DePrato
> Computer Science & Engineering
> Math and Music Minors
> Student at University of Michigan
> Computer Consultant at EECS DCO
>
> --
> Travis DePrato
> Computer Science & Engineering
> Math and Music Minors
> Student at University of Michigan
> Computer Consultant at EECS DCO

--
Travis DePrato
Computer Science & Engineering
Math and Music Minors
Student at University of Michigan
Computer Consultant at EECS DCO
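
The configuration changes suggested in this thread can be sketched as the slurm.conf fragment below. It is an illustration under stated assumptions, not the poster's actual file: the node names, partition name, CPU count, and memory sizes are placeholders. DefMemPerCPU corresponds to the "default limit per cpu" Carlos mentions; RealMemory is not mentioned anywhere in the thread and is included only as a possible explanation, since Slurm treats a node defined without RealMemory as having 1 MB, which would be consistent with --mem=1 being the only request accepted.

```
# Sketch only: node names, partition name, CPUs, and memory sizes are placeholders.
SelectType=select/cons_res
# Track both CPUs and memory as consumable resources (suggested in the thread).
SelectTypeParameters=CR_CPU_Memory
# Assumption: default memory per allocated CPU, in MB, for jobs that omit --mem.
DefMemPerCPU=1000
# Assumption: RealMemory (in MB) is declared so that --mem requests above 1 can fit.
NodeName=node[01-08] CPUs=8 RealMemory=16000 State=UNKNOWN
PartitionName=main Nodes=node[01-08] Default=YES State=UP
```

After a change like this, the same file has to reach every node and the daemons have to be restarted, as Lachlan asks above. Running scontrol show node with a node name reports the RealMemory value the controller actually sees, which is one way to check whether the setting took effect.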