[slurm-dev] Re: Finding out total number of cores in the cluster

2015-11-15 Thread Mikael Johansson
Hello, Well, to get the total number that SLURM is aware of, a simple command would be "sinfo -o %C", which shows Allocated/Idle/Offline/Total CPUs. Cheers, Mikael J. http://www.iki.fi/~mpjohans/ On Sun, 15 Nov 2015, Gene Soudlenkov wrote: > > Hi, > > Is there a way to find total numb
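The `%C` format token makes `sinfo` print CPU counts as a single `allocated/idle/other/total` field. A minimal shell sketch of extracting the total from that field (the sample numbers below are hypothetical; on a live cluster the value would come from `sinfo -h -o %C` itself):

```shell
# Sample value in the format `sinfo -h -o %C` emits: allocated/idle/other/total.
# (Hypothetical numbers; on a real cluster: cpus=$(sinfo -h -o %C))
cpus="96/28/4/128"

# The total CPU count is the fourth slash-separated field.
total=$(echo "$cpus" | cut -d/ -f4)
echo "Total CPUs: $total"
```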

[slurm-dev] Re: Allocating Resources Fully until last node

2014-12-11 Thread Mikael Johansson
Hello there, On Thu, 11 Dec 2014, Thompson, Matt [SCIENCE SYSTEMS AND APPLICATIONS INC] wrote: However, SLURM is actually allocating for us so that if I ask for --ntasks=100, I get out a more balanced load: SLURM_TASKS_PER_NODE=12,11(x8) So I was wondering: is there a way with salloc/sbat
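SLURM reports the balanced layout in `SLURM_TASKS_PER_NODE` using a compact `count(xreps)` notation. As an illustration, the value quoted above can be expanded into one task count per node; the parser below is my own hedged sketch, not a SLURM-provided tool:

```shell
# Expand SLURM_TASKS_PER_NODE's compact notation, e.g. "12,11(x8)",
# into one task count per node. Hypothetical helper, not part of SLURM.
spec="12,11(x8)"

expanded=$(echo "$spec" | tr ',' '\n' | while IFS= read -r part; do
  case "$part" in
    *"(x"*)
      count=${part%%\(*}                    # "11(x8)" -> "11"
      reps=${part##*\(x}; reps=${reps%\)}   # "11(x8)" -> "8"
      i=0
      while [ "$i" -lt "$reps" ]; do echo "$count"; i=$((i+1)); done
      ;;
    *) echo "$part" ;;
  esac
done | xargs)
echo "$expanded"
```

For `--ntasks=100` on 9 nodes this yields `12 11 11 11 11 11 11 11 11`, whose entries sum to 100.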

[slurm-dev] (ReqNodeNotAvail) w/ multiple partitions error in pre-14.11

2014-12-11 Thread Mikael Johansson
v 2014 03:34:59 -0800 From: Mikael Johansson To: slurm-dev Subject: [slurm-dev] Re: Odd (ReqNodeNotAvail) and (PartitionNodeLimit) with multiple partitions Hello All, Just an update on this, it seems it indeed was just a bug in the old 2.2.7 version of SLURM; after an upgrade to 14.11.0,

[slurm-dev] 14.11 Raw Usage update problem

2014-12-10 Thread Mikael Johansson
as a documented work-around. Cheers, Mikael J. http://www.iki.fi/~mpjohans/ On Thu, 27 Nov 2014, Mikael Johansson wrote: Hello again, A bit more detail, in case it helps. It seems that the Raw Usage is only updated based on the longest running job on the system (or perhaps only for the

[slurm-dev] Re: Upgrading from 2.2.7 -> version 14, Raw Usage problem

2014-11-27 Thread Mikael Johansson
user with the new "oldest job". Still, any suggestion very much appreciated. Cheers, Mikael J. On Thu, 27 Nov 2014, Mikael Johansson wrote: Hello All, OK, so yesterday we upgraded to 14.11.0. Everything went rather smoothly, except for one thing: Accounting of Raw Usage see

[slurm-dev] Re: Odd (ReqNodeNotAvail) and (PartitionNodeLimit) with multiple partitions

2014-11-27 Thread Mikael Johansson
Hello All, Just an update on this, it seems it indeed was just a bug in the old 2.2.7 version of SLURM; after an upgrade to 14.11.0, nodes shared by three partitions do not confuse SLURM anymore. Cheers, Mikael J. http://www.iki.fi/~mpjohans/ On Tue, 21 Oct 2014, Mikael Johansson

[slurm-dev] Re: Upgrading from 2.2.7 -> version 14, Raw Usage problem

2014-11-27 Thread Mikael Johansson
Hello All, OK, so yesterday we upgraded to 14.11.0. Everything went rather smoothly, except for one thing: Accounting of Raw Usage seems not to be working properly. It works partly, yesterday it seemed to be working for all users, for the moment only the Raw Usage for one user is updated, w

[slurm-dev] Re: Upgrading from 2.2.7 -> version 14

2014-11-01 Thread Mikael Johansson
Mikael Johansson : Hello All, We have a power-out soon on our cluster, and I thought now would be a good opportunity to update SLURM. Just three quick questions to ease my mind: 1. Is an upgrade from version 2.2.7 to 14.03 expected to work "in-place" without problems, that is, wi

[slurm-dev] Upgrading from 2.2.7 -> version 14

2014-10-31 Thread Mikael Johansson
Hello All, We have a power-out soon on our cluster, and I thought now would be a good opportunity to update SLURM. Just three quick questions to ease my mind: 1. Is an upgrade from version 2.2.7 to 14.03 expected to work "in-place" without problems, that is, without loss of the user datab

[slurm-dev] Odd (ReqNodeNotAvail) and (PartitionNodeLimit) with multiple partitions

2014-10-21 Thread Mikael Johansson
Hello All, I had a problem with jobs being stuck in the queue and not being scheduled even with unused cores on the cluster. The system has four partitions, three different "high priority" ones and one lower priority, "backfill" partition. A concise description of the setup in slurm.conf,

[slurm-dev] Re: Partition for unused resources until needed by any other partition

2014-10-21 Thread Mikael Johansson
Thanks! That looks like something that could be useful indeed. We are for the moment stuck with version 2.2.7, though, and if I understood the docs correctly, most of the partition based parameters are of later date and versions. We might upgrade in some future, though. It also seems like

[slurm-dev] Re: Partition for unused resources until needed by any other partition

2014-10-20 Thread Mikael Johansson
ing, but then immediately decreases it when the jobs start? Cheers, Mikael J. http://www.iki.fi/~mpjohans/ On Mon, 20 Oct 2014, je...@schedmd.com wrote: This should help: http://slurm.schedmd.com/preempt.html Quoting Mikael Johansson : Hello All, I've been scratching my he

[slurm-dev] Partition for unused resources until needed by any other partition

2014-10-20 Thread Mikael Johansson
Hello All, I've been scratching my head for a while now trying to figure this one out, which I would think would be a rather common setup. I would need to set up a partition (or whatever, maybe a partition is actually not the way to go) with the following properties: 1. If there are any u
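A setup like this maps naturally onto SLURM's partition-based preemption. The fragment below is a hypothetical `slurm.conf` sketch, not the poster's actual configuration: the node and partition names are invented, it assumes SLURM 2.3 or later (where `PreemptType` is available), and the partition-level `Priority` parameter shown here was later superseded by `PriorityTier` in newer releases:

```
# Hypothetical slurm.conf sketch: a low-priority partition that only uses
# otherwise-idle resources and is preempted when higher-priority work arrives.
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE

PartitionName=hiprio1  Nodes=node[01-16] Priority=10
PartitionName=backfill Nodes=node[01-16] Priority=1 PreemptMode=REQUEUE
```

With `preempt/partition_prio`, jobs in the `backfill` partition run whenever nodes are idle, and are requeued as soon as a job in a higher-`Priority` partition needs those nodes.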