Hello,
Running slurm 15.08.11 on FreeBSD 10.3-RELEASE we're seeing issues with sacct
not reporting CPU information correctly (or I am misunderstanding what should be
reported). Where should we start digging to figure out why NCPUS and AllocCPUS
are always 0?
Regards,
Joseph
% sacct --format=jo
Hi Thekla,
maybe it is not a real bug in slurmd but a caching issue/race. Have you
enabled the CacheGroups option of Slurm? If so, can you try setting
CacheGroups=0, restart the Slurm daemons, and tell us whether the behavior
has changed?
Also I would like to see the groups that were set wh
With strace you can see what the id command is doing in both cases:
1) When you call it without arguments, it internally calls getuid and
getgid and returns the groups that were set for the current process
(bash in this case). You can see the groups that were set in the current
shell with the command
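
The difference between the two cases can be sketched in Python. This is only an illustration, not what id itself runs: it assumes the second case (cut off in this excerpt) is id called with a user name, which looks the user up in the group database (files, LDAP, ...). The helper names process_groups, database_groups and missing_from_process are made up for this example.

```python
import os
import pwd


def process_groups():
    """Group IDs attached to the current process (roughly what plain `id` shows)."""
    return sorted(set(os.getgroups()) | {os.getegid()})


def database_groups(username):
    """Group IDs the name service lists for the user (roughly `id <username>`)."""
    primary_gid = pwd.getpwnam(username).pw_gid
    return sorted(set(os.getgrouplist(username, primary_gid)))


def missing_from_process(username):
    """Groups the database knows about but the current process does not carry."""
    return sorted(set(database_groups(username)) - set(process_groups()))


if __name__ == "__main__":
    user = pwd.getpwuid(os.getuid()).pw_name
    print("process :", process_groups())
    print("database:", database_groups(user))
    print("missing :", missing_from_process(user))
```

A non-empty "missing" list inside a job would mean the process was started without some of the groups that the database lists for the user.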
Hi Valanti,
Thanks a lot for your quick reply :)
When we get interactive access to a node through SLURM and type the id
command with and without arguments, the output is different.
Please see below:
[thekla@node05 ~]$ id
uid=2017(thekla) gid=5000(cstrc) groups=5000(cstrc)
[thekla@node05 ~]$ i
OK Thekla now I understand better what's going on.
It really seems to be a problem with Slurm. More specifically, slurmd on
the compute nodes, which runs as root, changes to the user's uid
before it starts the application, and during that step it should set the
groups (secondary also) bu
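
For reference, the usual order for a root process that switches to a user while keeping the secondary groups can be sketched as below. This is a generic POSIX pattern written in Python, not slurmd's actual code; the function name drop_privileges is made up for this example, and the calls after the lookup only succeed when the caller is root.

```python
import os
import pwd


def drop_privileges(username):
    """Switch a root process to `username`, keeping the supplementary groups."""
    entry = pwd.getpwnam(username)  # raises KeyError if the user is unknown
    # 1. Load the supplementary groups from the group database while still root.
    os.initgroups(username, entry.pw_gid)
    # 2. Set the primary group; this must happen before root is given up.
    os.setgid(entry.pw_gid)
    # 3. Switch the uid last; afterwards the process can no longer change groups.
    os.setuid(entry.pw_uid)
```

If step 1 is skipped, or if it runs against a stale group cache, the process ends up with only the primary group, which would match the id output shown earlier in the thread.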
Hi Mike,
On 25-05-16 13:22, Mike Johnson wrote:
I am in an environment that uses NFSv4, which obviously needs
user credentials to grant access to filesystems. Has anyone else
tackled the issue of unattended batch jobs successfully? I'm aware of
AUKS.
We are using Kerberised NFS4 with Slurm
They've been doing things like this at CERN for donkey's years - with the Andrew
File System in the past.
Look for Ticket Granting Tickets. Sorry - my memory is getting hazy.
-Original Message-
From: Mike Johnson [mailto:m.d.john...@durhamonline.org]
Sent: 25 May 2016 12:22
To: slurm-dev
Hi Valanti! :)
We are using nslcd on the compute nodes.
We have indeed changed the default behavior/command of salloc but I
don't think that this is the issue because the same happens when we
submit jobs via sbatch. So I believe that this is not related to the new
command we are using.
When
Hi all,
I know this is a long-standing question, but thought it was worth
asking. I am in an environment that uses NFSv4, which obviously needs
user credentials to grant access to filesystems. Has anyone else
tackled the issue of unattended batch jobs successfully? I'm aware of
AUKS. Is there
Hi Thekla! :)
To me it looks like a configuration issue with the LDAP name service
client on the compute nodes. Which service are you using, nslcd or
sssd? I can see that you have changed the default behavior/command of
salloc and the command gives you a prompt on the compute node directly
(b
Dear all,
We have noticed a very strange problem every time we add an existing
user to a secondary group.
We manage our users in LDAP. When we add a user to a new group and then
type the "id" and "groups" commands we see that the user was indeed
added to the new group. The same happens when r
11 matches