On 10/3/20 1:40 pm, mike tie wrote:
Here is the output of lstopo
Hmm, well I believe Slurm should be using hwloc (which provides lstopo)
to get its information (at least it calls the xcpuinfo_hwloc_topo_get()
function for that), so if lstopo works then slurmd should too.
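When the two disagree, it can help to print both views side by side. A minimal sketch, guarded so it runs cleanly even on a host where neither tool is installed:

```shell
# Compare what slurmd would report to the controller with hwloc's own view.
if command -v slurmd >/dev/null 2>&1; then
  slurmd -C            # prints NodeName=... CPUs=... Sockets=... RealMemory=...
else
  echo "slurmd not installed on this host"
fi
if command -v lstopo-no-graphics >/dev/null 2>&1; then
  lstopo-no-graphics   # hwloc's view of the same hardware
else
  echo "hwloc/lstopo not installed on this host"
fi
```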
Ah, looking a bit
Yup, I think if you are stuck that badly, the first thing is to make sure the
node does not get the number 10 from the controller, and the second is to just
reimage the VM fresh. It may not be the quickest way, but at least it is
predictable in terms of time spent.
Good luck!
-kkm
On Wed, Mar 11, 2020 at
On 11-03-2020 20:01, Will Dennis wrote:
I have one cluster running v16.05.4 that I would like to upgrade if
possible to 19.05.5; it was installed via a .deb package I created back
in 2016. I have located a 17.11.7 Ubuntu PPA
(https://launchpad.net/~jonathonf/+archive/ubuntu/slurm) and have
The release notes at https://slurm.schedmd.com/archive/slurm-19.05.5/news.html
indicate you can upgrade from 17.11 or 18.08 to 19.05. I didn’t find equivalent
release notes for 17.11.7, but upgrades over one major release should work.
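For what it's worth, the usual ordering when upgrading is: back up the accounting database, then upgrade slurmdbd first, then slurmctld, then the slurmd daemons, since a newer slurmdbd/slurmctld can talk to older daemons but not the reverse. A hedged sketch (the dump command and service names are illustrative, and everything is guarded so it runs cleanly on a host without Slurm or MySQL):

```shell
# 1. Back up the accounting database before upgrading anything.
if command -v mysqldump >/dev/null 2>&1 && mysqladmin ping >/dev/null 2>&1; then
  mysqldump slurm_acct_db > slurm_acct_db.backup.sql
else
  echo "no reachable local MySQL; back up slurm_acct_db by other means"
fi

# 2. Upgrade the daemons in dependency order: slurmdbd first, then
#    slurmctld, then slurmd on the compute nodes, restarting each in turn.
for svc in slurmdbd slurmctld slurmd; do
  echo "next: upgrade package and restart $svc"
done
```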
> On Mar 11, 2020, at 2:01 PM, Will Dennis wrote:
>
>
Hi all,
I have one cluster running v16.05.4 that I would like to upgrade if possible to
19.05.5; it was installed via a .deb package I created back in 2016. I have
located a 17.11.7 Ubuntu PPA
(https://launchpad.net/~jonathonf/+archive/ubuntu/slurm) and have myself
recently put up one for
Dear All,
We have an HPC cluster with the Slurm job scheduler (17.02.8). There are
several private partitions (sponsored by several groups) and a "common"
partition. The private partitions are used exclusively by those private users, and
all users (including private users) have equal access
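For reference, partition-level access control of that sort is usually expressed in slurm.conf. A hypothetical fragment (partition names, node ranges, and the Unix group are invented):

```
# A sponsored private partition restricted to one group, plus a common
# partition open to all users by default.
PartitionName=private_a Nodes=node[01-04] AllowGroups=group_a State=UP
PartitionName=common    Nodes=node[01-12] Default=YES State=UP
```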
Yep, slurmd -C is obviously getting the data from somewhere, either a local
file or from the master node. Hence my email to the group; I was hoping
someone would just say: "yeah, modify file ". But oh well. I'll
start playing with strace and gdb later this week; looking through the
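For the strace route, one possible starting point is to trace the files slurmd -C opens; the interesting hits are usually under /sys/devices/system/cpu (via hwloc) and slurm.conf. A guarded sketch, so the script still exits cleanly on a host without strace or slurmd:

```shell
if command -v strace >/dev/null 2>&1 && command -v slurmd >/dev/null 2>&1; then
  # Show only file-open syscalls, filtered to the likely sources.
  strace -f -e trace=open,openat slurmd -C 2>&1 \
    | grep -E 'slurm\.conf|/sys/devices' | head
else
  echo "strace and/or slurmd not available here"
fi
```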
Hello,
I have a queue with 6 servers.
When 4 of the servers are under heavy load and I send new jobs to the other 2
servers, which are free and in a different partition with different features, the
jobs still sit in the pending state (it can take them 20 minutes to start running).
If I change their priority with
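Before changing priorities, it is worth checking the reason Slurm itself gives for holding the jobs. A guarded sketch (JOBID is hypothetical):

```shell
JOBID=12345
if command -v squeue >/dev/null 2>&1; then
  # %r prints the Reason field, e.g. Priority, Resources, ReqNodeNotAvail
  squeue -j "$JOBID" -o '%.10i %.10T %.20r'
  scontrol show job "$JOBID" | grep -E 'Reason|Priority'
else
  echo "squeue not available on this host"
fi
```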
On Tue, Mar 10, 2020 at 1:41 PM mike tie wrote:
> Here is the output of lstopo
>
> $ lstopo -p
>
> Machine (63GB)
>   Package P#0 + L3 (16MB)
>     L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#0
>     L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#1
>     L2