[slurm-dev] Re: Fix for need to restart slurmctld when adding user to accounting

2016-03-30 Thread Bill Broadley
On 03/30/2016 08:15 PM, Gene Soudlenkov wrote: > > H I don't think this is the case - throughout the code they use > gethostname (not byname) to get the name of the particular host. I didn't track down the source; the documentation claims gethostbyname. To quote the slurm.conf page: Con
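The difference between the two calls debated above can be seen with a minimal Python sketch. It uses the standard `socket` module, not Slurm's C code, and `describe_host` is a hypothetical helper name introduced only for illustration. `gethostname` returns the name the local kernel reports, while `gethostbyname` sends a name through the resolver and returns an address, so the two can disagree when DNS or /etc/hosts does not match the local name.

```python
import socket


def describe_host():
    """Return (local_name, resolved_address_or_None) for this machine."""
    # Name reported by the local system; no resolver lookup is involved.
    name = socket.gethostname()
    try:
        # Resolver lookup (DNS, /etc/hosts, ...) of that same name.
        addr = socket.gethostbyname(name)
    except OSError:
        # The local name does not resolve on this machine.
        addr = None
    return name, addr


if __name__ == "__main__":
    name, addr = describe_host()
    print(f"gethostname() -> {name}")
    print(f"gethostbyname({name!r}) -> {addr}")
```

Running this on the controller host shows whether the locally reported name resolves at all, which may help when comparing it against the hostname entered in the configuration.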

[slurm-dev] Re: Fix for need to restart slurmctld when adding user to accounting

2016-03-30 Thread Gene Soudlenkov
H I don't think this is the case - throughout the code they use gethostname (not byname) to get the name of the particular host. On 31/03/16 16:06, Bill Broadley wrote: I think I found the problem and solution. The Slurm 15.08 configuration tool [1] mentions: Def

[slurm-dev] Fix for need to restart slurmctld when adding user to accounting

2016-03-30 Thread Bill Broadley
I think I found the problem and solution. The Slurm 15.08 configuration tool [1] mentions: Define the hostname of the computer on which the Slurm controller and optional backup controller will execute. You can also specify addresses of these computers if desired (defaul

[slurm-dev] Re: Need to restart slurmctld when adding user to accounting

2016-03-30 Thread Danny Auble
Chris is right. If you ever have this problem it should be fairly clearly marked in both slurmctld and slurmdbd logs when it fails. Usually a firewall like iptables is to blame or different slurm users set in the various .conf files as mentioned before. On March 30, 2016 5:57:19 PM PDT, Gene
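One way to check for the mismatched-user case mentioned above is to compare the `SlurmUser` setting in the two configuration files. The following is a minimal Python sketch, not a Slurm tool: `slurm_user` is a hypothetical helper, the sample configuration strings are made up, and falling back to `root` when the parameter is unset is an assumption you should confirm against your version's documentation.

```python
import re


def slurm_user(conf_text, default="root"):
    """Return the SlurmUser value from config text, or `default` if unset."""
    for line in conf_text.splitlines():
        # Drop comments and surrounding whitespace before matching.
        line = line.split("#", 1)[0].strip()
        m = re.match(r"(?i)SlurmUser\s*=\s*(\S+)", line)
        if m:
            return m.group(1)
    return default  # assumption: an unset SlurmUser means root


if __name__ == "__main__":
    # Hypothetical file contents; in practice, read slurm.conf and slurmdbd.conf.
    ctld_conf = "ClusterName=test\nSlurmUser=slurm\n"
    dbd_conf = "# SlurmUser=slurm\nDbdHost=localhost\n"
    a, b = slurm_user(ctld_conf), slurm_user(dbd_conf)
    print("match" if a == b else f"mismatch: slurmctld={a} slurmdbd={b}")
```

With the sample strings shown, the second file has its `SlurmUser` line commented out, so the sketch reports a mismatch between `slurm` and the assumed default.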

[slurm-dev] Re: Need to restart slurmctld when adding user to accounting

2016-03-30 Thread Gene Soudlenkov
We've been having the same problem for years - and we still need to do it. Gene On 31/03/16 13:46, Christopher Samuel wrote: On 31/03/16 11:33, Terri Knight wrote: Upon further testing, I only need to restart the slurmctld daemon to get the new user added such that he can run a job. I think wh

[slurm-dev] Re: Need to restart slurmctld when adding user to accounting

2016-03-30 Thread Christopher Samuel
On 31/03/16 11:33, Terri Knight wrote: > Upon further testing, I only need to restart the slurmctld daemon to get > the new user added such that he can run a job. I think when you add a user with sacctmgr, slurmdbd will try to do an RPC to slurmctld on the registered clusters to inform them of this

[slurm-dev] Re: Need to restart slurmctld when adding user to accounting

2016-03-30 Thread Douglas Jacobsen
Sorry, you just said they were, somehow misread this. Try increasing logging level, perhaps the easiest way is running slurmctld and slurmdbd interactively with the -Dvvv arguments. Then add a user and see if any errors occur, particularly on the slurmctld side after the sacctmgr update is done.

[slurm-dev] Re: Need to restart slurmctld when adding user to accounting

2016-03-30 Thread Douglas Jacobsen
Are both slurmdbd and slurmctld running as the same UID? (if not they need to be, I believe you can see the errors on slurmdbd debug2 or debug3) Doug Jacobsen, Ph.D. NERSC Computer Systems Engineer National Energy Research Scientific Computing Center dmjacob...@lbl.g

[slurm-dev] Need to restart slurmctld when adding user to accounting

2016-03-30 Thread Terri Knight
I posted earlier (Dec 28, 2015) about this issue and was told to check that the slurmdbd and slurmctld daemons were running as the same user; they weren't at that time. I thought making that change would resolve the problem, but it did not. These daemons are now both running as root root 6463