Can you give the exact command you ran and the output you got from it?
I suspect a typo, either in the node names in your slurm.conf or in what you are typing.
Brian Andrus
On 6/18/2019 11:29 PM, nathan norton wrote:
Hi,
It just shows
"Node $NODE not found"
Whereas the others all work as expected (i.e., they are running).
Without knowing the internals of Slurm, it feels like nodes that are
turned off and in the cloud state don't exist in the system until they are on?
Any other ideas?
Thanks
Nathan
On Wed., 19 Jun. 2019, 4:21 pm Chris Samuel <ch...@csamuel.org> wrote:
On Tuesday, 18 June 2019 9:36:56 PM PDT nathan norton wrote:
> Just tried running that command, but it only shows nodes that are up and
> running, doesn’t tell me about any nodes that are down and turned off, as
> an example please see below. There is a job running that should be using
> the 100 nodes but only 52 are allocated (plus 2 down* (that I know about
> and don’t care about in this case)) where are the stats and details on why
> the 40ish other nodes are not being used? (nothing in the masters log file
> either)
I suspect this is related to their cloud state.
What does "scontrol show node $NODE" say where $NODE is the name of a node
that isn't being listed despite you expecting it to be?
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
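For anyone wanting to collect the exact command and output for every affected node in one go, here is a minimal sketch in Python. It assumes only what the thread shows: that "scontrol show node $NODE" prints a message containing "not found" for the nodes in question. The function names (scontrol_show_node, nodes_not_found) are made up for this example, and the check matches on the message text rather than on an exit code, since the thread does not say what scontrol returns in this case.

```python
import subprocess
from typing import Callable, Dict, Iterable


def scontrol_show_node(name: str) -> str:
    """Run `scontrol show node NAME` and return stdout and stderr combined."""
    proc = subprocess.run(
        ["scontrol", "show", "node", name],
        capture_output=True,
        text=True,
        check=False,  # do not raise on a non-zero exit; we inspect the text instead
    )
    return proc.stdout + proc.stderr


def nodes_not_found(
    names: Iterable[str],
    show: Callable[[str], str] = scontrol_show_node,
) -> Dict[str, str]:
    """Map each node whose output contains 'not found' to that raw output.

    `show` is injectable so the logic can be tried without a Slurm install.
    Matching on the text "not found" is an assumption based on the message
    quoted in this thread ("Node $NODE not found").
    """
    missing: Dict[str, str] = {}
    for name in names:
        output = show(name)
        if "not found" in output.lower():
            missing[name] = output.strip()
    return missing
```

Calling nodes_not_found() with the list of node names you expect (as they appear in your slurm.conf) on a host where scontrol is on the PATH returns a dict of the names Slurm does not know about, together with the exact text it printed for each, which is the kind of detail asked for above.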