I managed to narrow this down a bit. Our /etc/group file is pretty large, and
a handful of individual groups in it are also quite large, as shown below:

[root@batch1 ~]# wc /etc/group
  6075   6075 349457 /etc/group

[root@batch1 ~]# grep 8xxx2 /etc/group | wc -c
56959

It looks like one of the recent changes
(https://github.com/SchedMD/slurm/commit/e1b4cdba70f7f1b5ac5335c572d9c4c79e6e1259)
migrated the old uid check to the dedicated `gid_from_uid` function. However,
one important side effect of that migration is that we've lost this part of
the old loop, which doubled the buffer and retried whenever the lookup failed
with ERANGE:

```
                        if (errno == ERANGE) {
                                buflen *= 2;
                                xrealloc(buf, buflen);
                                continue;
                        }
```

Without that retry, I think we're hitting a fixed buffer limit on our larger
group entries. Trimming our groups down far enough does get us back to normal
operation, but unfortunately that's not a tenable solution for us.
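
For reference, the grow-and-retry pattern from the old loop can be sketched in
standalone C roughly as follows. This is only an illustration under my own
assumptions, not the Slurm code: `lookup_group_growing`, the 1024-byte default
and the 64 MiB cap are hypothetical names and values, and it uses plain
`malloc`/`realloc` with `getgrgid_r` rather than Slurm's `xrealloc` or whatever
lookup `gid_from_uid` actually performs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical upper bound so a broken backend cannot make us loop forever. */
#define GROUP_BUF_CAP ((size_t)64 * 1024 * 1024)

/*
 * Hypothetical helper (not part of Slurm): look up a group by gid, doubling
 * the scratch buffer each time getgrgid_r() reports ERANGE.
 *
 * Returns 0 on success, ENOENT if no such group exists, or another errno
 * value on failure. On success, *buf_out owns the memory that the strings
 * in *out point into; the caller must free() it.
 */
int lookup_group_growing(gid_t gid, struct group *out, char **buf_out,
                         size_t initial_len)
{
    assert(out != NULL && buf_out != NULL);

    size_t buflen = initial_len ? initial_len : 1024;
    char *buf = malloc(buflen);
    if (!buf)
        return ENOMEM;

    for (;;) {
        struct group *result = NULL;
        /* getgrgid_r() returns the error number directly. */
        int rc = getgrgid_r(gid, out, buf, buflen, &result);

        if (rc == 0 && result != NULL) {
            *buf_out = buf;
            return 0;
        }
        if (rc == EINTR)
            continue;               /* interrupted, try again as-is */
        if (rc != ERANGE) {
            free(buf);
            return rc ? rc : ENOENT; /* rc == 0 with no result: not found */
        }

        /* ERANGE: the entry did not fit, so double the buffer and retry. */
        if (buflen > GROUP_BUF_CAP / 2) {
            free(buf);
            return ERANGE;
        }
        buflen *= 2;
        char *bigger = realloc(buf, buflen);
        if (!bigger) {
            free(buf);
            return ENOMEM;
        }
        buf = bigger;
    }
}
```

With a loop like this, an entry the size of our 56959-byte group would just
trigger a few extra doublings instead of failing the lookup outright.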

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
