Re: [slurm-users] "Low socket*core*thre" - solution?

2018-05-06 Thread Chris Samuel
On Sunday, 6 May 2018 10:09:55 PM AEST Mahmood Naderan wrote:
> I still think that, for some reason, Slurm puts the frontend in the drain
> state. Maybe, in order not to overload the main node with user jobs, it
> sets the state to drain, which is actually fake.
Pretty sure Slurm won't do that, it doesn't car

Re: [slurm-users] "Low socket*core*thre" - solution?

2018-05-06 Thread Chris Samuel
On Sunday, 6 May 2018 10:03:59 PM AEST Mahmood Naderan wrote:
> The chassis of the frontend is the same as the compute nodes, but with slightly different CPUs.
> A motherboard with two Opterons, each with 16 cores. However, the head
> node is not included correctly, while the compute nodes are added witho

Re: [slurm-users] "Low socket*core*thre" - solution?

2018-05-06 Thread Mahmood Naderan
I still think that, for some reason, Slurm puts the frontend in the drain state. Maybe, in order not to overload the main node with user jobs, it sets the state to drain, which is actually fake. I also checked the commands used in the slurm roll (package from Werner) and nothing was incorrect. Similar to settin

Re: [slurm-users] "Low socket*core*thre" - solution?

2018-05-06 Thread Mahmood Naderan
The chassis of the frontend is the same as the compute nodes: a motherboard with two Opterons, each with 16 cores. However, the head node is not included correctly, while the compute nodes are added without problem.
[root@rocks7 ~]# grep -R rocks7 /etc/slurm
/etc/slurm/partitions.conf.new:PartitionName

Re: [slurm-users] "Low socket*core*thre" - solution?

2018-05-06 Thread Chris Samuel
On Sunday, 6 May 2018 7:28:55 PM AEST Mahmood Naderan wrote:
> I also have noticed that State returned back to IDLE+DRAIN!
Both you and Eric are having issues with Opteron 6300 series CPUs. I can't help but think the fact that each package in a socket has 2 NUMA nodes is the cause of your pain.

Re: [slurm-users] sacct: error

2018-05-06 Thread Chris Samuel
On Sunday, 6 May 2018 2:58:26 PM AEST Chris Samuel wrote:
> Very very interesting - both slurmd and lscpu report 32 cores, but with
> differing interpretations of the number of the layout. Meanwhile the AMD
> website says these are 16 core CPUs, which means both Slurm and lscpu are
> wrong!
Of c

Re: [slurm-users] "Low socket*core*thre" - solution?

2018-05-06 Thread Mahmood Naderan
Although this thread belongs to someone else, a solution may apply to others too.
[root@rocks7 ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Cor
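A minimal sketch of the kind of topology check this thread revolves around, assuming lscpu-style "Key: value" output. The output above is cut off before the remaining fields, so the Core(s) per socket and Socket(s) values in SAMPLE are illustrative assumptions, not values taken from the message; parse_lscpu and topology_product are hypothetical helper names.

```python
def parse_lscpu(text):
    """Parse 'Key: value' lines from lscpu-style output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def topology_product(fields):
    """Return sockets * cores-per-socket * threads-per-core as reported by lscpu."""
    return (
        int(fields["Socket(s)"])
        * int(fields["Core(s) per socket"])
        * int(fields["Thread(s) per core"])
    )


# Illustrative input: the last two values are ASSUMED, not from the thread.
SAMPLE = """\
Architecture:          x86_64
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
"""

fields = parse_lscpu(SAMPLE)
product = topology_product(fields)
# True when the reported layout accounts for every logical CPU.
consistent = product == int(fields["CPU(s)"])
print(product, consistent)
```

Comparing this product with the Sockets, CoresPerSocket and ThreadsPerCore values given for the node in slurm.conf is one way to look for a mismatch between the configured and the detected layout.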

Re: [slurm-users] Finding / compiling "pam_slurm.so" for Ubuntu 16.04

2018-05-06 Thread Gennaro Oliva
Hi Will,
On Fri, May 04, 2018 at 06:50:07PM +, Will Dennis wrote:
> I built my .deb from the Slurm sources via the following method:
>
> · Downloaded the then-current Slurm source ‘slurm-16.05.4.tar.bz2’
> from schedmd.com
>
> · Renamed & converted to .tar.gz to fit Debia