On Friday, 17 April 2020 2:22:00 PM PDT Dean Schulze wrote:
> Both work. The only discrepancy is that the slurm controller output had
> these two lines:
>
> UID: ??? (1000)
> GID: ??? (1000)
>
As if the controller doesn't know the username for UID 1000.
What does this mean?
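The "???" usually means the name lookup for that UID failed on the controller. A quick way to check, run on the controller host (a sketch; UID 1000 is the one from the log above):

```shell
# Check whether UID 1000 resolves to a name on this host.
# getent consults the same NSS sources (files, sssd, ldap) that
# slurmctld uses, so empty output here would explain the "???".
getent passwd 1000 | cut -d: -f1

# Sanity check with UID 0, which must resolve on any Linux host:
id -nu 0    # prints "root"
```

If the first command prints nothing on the controller but prints a username on the compute nodes, the user databases are out of sync between hosts.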
Someone else might see more than I do, but from what you’ve posted, it’s clear
that compute-0-0 will be used only after other lower-weighted nodes are too
full to accept a particular job.
I assume you’ve already submitted a set of jobs requesting enough resources to
fill up all the nodes, and that compute-0-0 still stayed idle.
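For context: Slurm allocates nodes in ascending Weight order, so a node with a very large Weight is chosen last. A minimal slurm.conf sketch (node names and the second Weight value are assumptions for illustration; 20511900 is from the grep output below):

```
# slurm.conf (sketch): the high Weight makes compute-0-0 the last choice.
NodeName=compute-0-1 CPUs=32 Weight=100
NodeName=compute-0-0 CPUs=32 Weight=20511900
```

You can confirm what the scheduler sees with `sinfo -N -o "%N %c %w"`, where `%w` prints each node's Weight.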
Hi,
Although compute-0-0 is included in a partition, I have noticed that
no job is offloaded there automatically. If someone intentionally
writes --nodelist=compute-0-0, it works fine.
# grep -r compute-0-0 .
./nodenames.conf.new:NodeName=compute-0-0 NodeAddr=10.1.1.254 CPUs=32
Weight=20511900 Fea
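For reference, explicitly pinning a job to that node (as described above) looks like this; `job.sh` is a placeholder for your batch script:

```
# Force a job onto compute-0-0 regardless of its Weight:
sbatch --nodelist=compute-0-0 job.sh
# or interactively:
srun --nodelist=compute-0-0 -N1 hostname
```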
I see two things you should likely do:
1. Run ntpd on your nodes. You can even have them sync with your master.
2. Sync your user data on the nodes too, even if that is just ensuring
/etc/passwd and /etc/group are the same on them all.
While ntp is not required for slurm, the time sy
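A quick way to verify point 2 across nodes (a sketch; the NODES list is a placeholder you'd set to your own hostnames):

```shell
# Compare the user databases across nodes; a checksum mismatch means
# a UID can resolve to different names (or to nothing) per node.
NODES="${NODES:-}"               # e.g. NODES="compute-0-0 compute-0-1"
for h in $NODES; do
    ssh "$h" md5sum /etc/passwd /etc/group
done

# Local sanity check: the same NSS lookup slurmctld performs.
getent passwd 0 | cut -d: -f1    # UID 0 resolves to "root" everywhere
```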