[slurm-dev] sacct issues

2016-05-25 Thread Joseph Mingrone
Hello, Running Slurm 15.08.11 on FreeBSD 10.3-RELEASE, we're seeing issues with sacct not reporting CPU information correctly (or I am misunderstanding what should be reported). Where should we start digging to figure out why NCPUS and AllocCPUS are always 0? Regards, Joseph % sacct

[slurm-dev] Re: Problem when adding user to secondary group

2016-05-25 Thread Chrysovalantis Paschoulas
Hi Thekla, maybe it is not a real bug in slurmd but a caching issue/race. Have you enabled the CacheGroups option of Slurm? If so, can you try setting CacheGroups=0, then restart the Slurm daemons and tell us whether the behavior has changed? Also I would like to see the groups that were set

[slurm-dev] Re: Problem when adding user to secondary group

2016-05-25 Thread Chrysovalantis Paschoulas
With strace you can see what the id command is doing in both cases: 1) When you call it without arguments, it internally calls getuid and getgid and returns the groups that were set for the current process (bash in this case). You can see the groups that were set in the current shell with the
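The distinction described in this message, between the groups attached to a running process and the groups the name service reports for a user, can be sketched in Python. This is an illustration only, not taken from the thread: the function names `process_groups`, `database_groups` and `missing_groups` are hypothetical, and the sketch assumes a Unix system where `os.getgroups` and `os.getgrouplist` are available.

```python
import os
import pwd


def process_groups():
    """Group IDs attached to the current process (roughly what plain `id` reports)."""
    return sorted(set(os.getgroups()) | {os.getegid()})


def database_groups(username):
    """Group IDs the name service (files, LDAP, ...) lists for `username`
    (roughly what `id <username>` reports)."""
    primary_gid = pwd.getpwnam(username).pw_gid
    return sorted(set(os.getgrouplist(username, primary_gid)))


def missing_groups(username):
    """Groups known to the name service but absent from this process."""
    return sorted(set(database_groups(username)) - set(process_groups()))
```

Calling `missing_groups` with the name of the user who owns the process returns an empty list when the two views agree. A non-empty list means the process was started without some of the user's current groups, which would match the symptom discussed in this thread.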

[slurm-dev] Re: Problem when adding user to secondary group

2016-05-25 Thread Thekla Loizou
Hi Valanti, Thanks a lot for your quick reply :) When I get interactive access to a node through SLURM and type the id command with and without arguments, the output is different. Please see below: [thekla@node05 ~]$ id uid=2017(thekla) gid=5000(cstrc) groups=5000(cstrc) [thekla@node05 ~]$

[slurm-dev] Re: Problem when adding user to secondary group

2016-05-25 Thread Chrysovalantis Paschoulas
OK Thekla, now I understand better what's going on. It really seems to be a problem with Slurm. More specifically, slurmd on the compute nodes, which runs as root, changes to the user's uid before it starts the application, and during that step it should set the groups (secondary also)
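The step described here, a root process switching to a user's identity and setting the secondary groups along the way, follows a common POSIX pattern. The sketch below illustrates that generic pattern in Python. It is not slurmd's code (slurmd is written in C), and the function name `drop_privileges` and its `ops` parameter are hypothetical, introduced only so the call order can be inspected without root.

```python
import os
import pwd


def drop_privileges(username, ops=os):
    """Switch the current process to `username`, including secondary groups.

    `ops` defaults to the os module; passing a stand-in object lets the
    sequence be checked without root privileges (an assumption of this sketch).
    """
    entry = pwd.getpwnam(username)
    # 1. Load the user's full group list from the name service.
    #    Skipping this leaves the process with the caller's groups.
    ops.initgroups(username, entry.pw_gid)
    # 2. Set the primary group while still privileged.
    ops.setgid(entry.pw_gid)
    # 3. Set the user ID last; after this the process cannot change groups.
    ops.setuid(entry.pw_uid)
```

The order matters: once `setuid` has succeeded the process no longer has the privilege to call `initgroups` or `setgid`, so the group setup has to come first. Run with the real `os` module, the function needs root.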

[slurm-dev] Re: NFSv4

2016-05-25 Thread Robbert Eggermont
Hi Mike, On 25-05-16 13:22, Mike Johnson wrote: I am in an environment that uses NFSv4, which obviously needs user credentials to grant access to filesystems. Has anyone else tackled the issue of unattended batch jobs successfully? I'm aware of AUKS. We are using Kerberised NFS4 with Slurm

[slurm-dev] RE: NFSv4

2016-05-25 Thread John Hearns
They've been doing things like this at CERN for donkey's years - with the Andrew File System in the past. Look for Ticket Granting Tickets. Sorry - my memory is getting hazy. -Original Message- From: Mike Johnson [mailto:m.d.john...@durhamonline.org] Sent: 25 May 2016 12:22 To:

[slurm-dev] Re: Problem when adding user to secondary group

2016-05-25 Thread Thekla Loizou
Hi Valanti! :) We are using nslcd on the compute nodes. We have indeed changed the default behavior/command of salloc but I don't think that this is the issue because the same happens when we submit jobs via sbatch. So I believe that this is not related to the new command we are using.

[slurm-dev] NFSv4

2016-05-25 Thread Mike Johnson
Hi all, I know this is a long-standing question, but thought it was worth asking. I am in an environment that uses NFSv4, which obviously needs user credentials to grant access to filesystems. Has anyone else tackled the issue of unattended batch jobs successfully? I'm aware of AUKS. Is

[slurm-dev] Re: Problem when adding user to secondary group

2016-05-25 Thread Chrysovalantis Paschoulas
Hi Thekla! :) To me it looks like a configuration issue with the client LDAP name service on the compute nodes. Which service are you using? nslcd or sssd? I can see that you have changed the default behavior/command of salloc and the command gives you a prompt on the compute node directly